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B.  INTRODUCTION 

Interval-censored  (IC)  data  are  encountered  in  three  areas  of  breast  cancer  research. 
The  most  common  application  is  in  clinical  relapse  follow-up  studies  in  which  the  study 
endpoint  is  disease-free  survival.  When  a  patient  relapses,  it  is  usually  known  that  the 
relapse  takes  place  between  two  follow-up  visits,  and  the  exact  time  to  relapse  is  unknown. 
In  statistics,  we  say  relapse  time  is  interval  censored.  Interval  censoring  is  also  encountered 
in  breast  cancer  registry  studies  in  which  information  on  family  history  of  cancer  is  updated 
periodically.  The  Strang  Breast  Surveillance  Program  for  women  at  increased  risk  for  breast 
cancer,  for  instance,  has  enlisted  over  800  women  with  complete  pedigree  information  which 
is  verified  and  updated  continuously.  Family  history  data  such  as  age  at  diagnosis  of  a 
specific  cancer,  or  a  benign  but  risk-conferring  condition,  are  obtained  from  each  registrant 
at  each  update.  Time  to  a  cancer  event,  and  definitely  time  to  first  detection  of  a  benign 
condition,  are  at  best  known  to  fall  in  the  time  interval  between  the  last  update  and  age 
at  diagnosis.  A  third  but  increasingly  important  area  of  application  of  interval  censoring 
is  in  breast  cancer  chemoprevention  experiments  or  prevention  trials,  which  involve  the 
observation  of  one  or  more  surrogate  endpoint  biomarkers  (SEB)  over  time.  The  scientific 
questions  of  interest  here  are  estimation  of  time  for  an  SEB  to  reach  a  target  value,  and 
estimation  of  time  from  cessation  of  intake  of  a  chemopreventive  agent  to  the  loss  of  its 
protective  effect.  Unfortunately,  the  exact  values  of  both  these  time  variables  are  known 
only  to  lie  in  between  two  successive  assay  inspection  times.  In  a  breast  cancer  follow-up 
study,  we  often  encounter  covariates  (for  instance,  tumor  size  and  nodal  status  in  a  relapse 
study,  and  baseline  SEB  value  in  a  chemoprevention  trial).  A  regression  model  will  be 
needed  for  the  analysis  of  such  data. 

Let  X  denote  a  time-to-event  variable  with  distribution  F{x)  =  Pr{X  <  a;),  or  equiv¬ 
alently,  survival  function  S{x)  =  1  —  E(a:).  In  interval  censoring,  X  is  not  observed  and 
is  known  only  to  lie  in  an  observable  interval  (L,  P).  In  our  previous  DOD  funded  grant, 
we  have  made  fundamental  contributions  to  both  the  theory  of  the  generalized  Tnaxirnmn 
likelihood  (GML)  estimation  of  S,  and  the  computation  in  connection  with  the  inference  of 
GML  estimator  (GMLE)  S  of  S.  These  contributions  are  restricted  to  the  case  of  univariate 
IC  data  without  covariates. 

The  Cox  proportional  hazards  regression  model  [2]  specifies  that  covariates  have  a 
proportional  effect  on  the  hazard  function  of  X.  This  model  provides  powerful  means  for 
fitting  failure  time  observations  to  a  distribution  free  model  and  for  estimating  the  risk  for 
failure  associated  with  a  vector  of  covariates.  It  is  extensively  used  for  right-censored  data. 
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Finkelstein  [3]  applied  the  Cox  model  to  the  analysis  of  IC  data.  However,  she  did  not 
establish  asymptotic  properties  of  the  GMLE  of  the  parameters  in  the  model;  moreover,  she 
did  not  investigate  the  convergence  properties  of  the  Newton-Raphson  (NR)  algorithm  for 
finding  the  GMLE  values. 

Our  recent  interest  in  IC  data  with  covariates  is  driven  by  needs  arising  from  two  re¬ 
lated  areas  of  breast  cancer  research  at  Strang  Cancer  Prevention  Center.  First,  we  have 
been  conducting  a  long-term  prognostic  follow-up  study  of  breast  cancer  bone  marrow  mi¬ 
crometastasis  (BMM)  on  relapse,  involving  375  women.  We  shall  need  an  eflficient  algorithm 
to  find  the  GMLE  of  the  regression  coefficients  of  the  Cox  model.  Second,  we  have  just  com¬ 
pleted  a  one-year  confirmation  chemoprevention  trial  of  of  indole-3-carbinol  (I3C)  for  breast 
cancer  prevention.  In  this  prevention  trial  we  monitored  the  levels  of  two  SEB’s,  a  urinary 
estrogen  metaboUte  ratio  and  a  blood  counterpart,  both  of  which  are  subject  to  interval 
censoring.  An  earlier  dose-ranging  study  of  I3C  conducted  by  Wong  et  al  [4]  has  been 
published. 

The  overall  aim  of  this  research  proposal  is  to  develop  statistical  inference  for  IC  data 
with  covariates  that  are  encountered  in  breast  cancer  relapse  follow-up  studies,  breast  can¬ 
cer  chemoprevention  trials  employing  surrogate  endpoint  biomarkers,  and  in  breast  can¬ 
cer  registry  follow-up  studies  of  familial  aggregation  of  breast  and  other  forms  of  cancer. 
Asymptotic  generalized  maximum  likelihood  theory  under  the  Cox  regression  model  will 
be  investigated  and  computer  software  package  for  maximum  likelihood  inference  will  be 
implemented. 

C.  BODY 

C.l.  Model  Formulation  and  Likelihood  Equations. 

Let  Yk,i  <  Yk,2  <  •••  <  Yk,k  denote  the  follow-up  times  for  a  patient  who  has  made 
K  follow-up  visits,  in  a  longitudinal  follow-up  study.  Since  the  number  of  visits  for  each 
patient  may  vary,  AT  is  a  random  positive  integer.  For  convenience,  define  Yk,o  —  0  and 
Yk,k+i  =  oo.  The  time-to-event  variable  of  interest,  X,  is  not  directly  observed;  instead,  it 
is  known  to  lie  in  between  two  successive  censoring  time  points  (kjcj)  T/fj+i),  where  j  =0, 
...,  K.  Note  that  X  is  left  censored  if  j  =  0,  strictly  interval  censored  if  0  <  j  <  K,  and 
right  censored  if  A  >  Yk,k-  The  observable  IC  data  corresponding  to  A  is  given  by 

(L,  R)  =  {YK,i,  Yj^.i+i)  if  YK,i  <  A  <  i  =  0, 1, ...,  K.  (1) 

In  addition  to  (L,  R),  we  also  observe  a  p  x  1  covariate  vector  Z.  We  assume  that  K 
and  the  Yk,j^s  are  independent  of  (A,  Z). 
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The  Cox  regression  model  for  the  survival  function  at  X  =  a;  given  Z  =  2;  is  represented 
by 

S(xW  =  [S<,(x)]«‘*,  (2) 

where  is  the  dot  product  of  Z  and  So{x)  is  a  baseline  smvival  function  and  /3  is  a 
p-dimensional  regression  coefficient  vector. 

Let  li  =  (Li,Ri,Zi),  i  =  1,  n,  be  a  random  sample  of  size  n  interval-censored 
observations  with  covariates.  In  terms  of  the  original  observed  intervals,  the  likelihood 
function  of  S  and  h  is  given  by 

L  =  n((S(i,))'‘'‘  -  ).  (3) 

i=l 

where  5  is  a  survival  function,  and  6  is  a  p  x  1  dimensional  vector.  The  GMLE  of  {So,P)  is 
a  value  (5,  h)  that  maximizes  (3)  over  all  survival  functions  S  and  all  6  e  TV*. 

Since  So  places  all  probability  mass  on  the  innermost  intervals  of  the  Ji’s  (see  Peto 
[5]  or  Turnbull  [6]),  it  is  often  computationally  simpler  to  express  L  in  terms  of  innermost 
intervals. 

We  say  that  an  interval  .4  is  an  innermost  interval  of  the  Ji’s  if  A  is  a  nonempty  finite 
intersection  of  one  or  more  of  the  Ji’s  such  that  either  Ji  D  A  =  0  or  Ji  D  A  =  A  for  each 
i.  Suppose  there  are  a  total  of  m  distinct  innermost  intervals  Ai  =  (^i,»7i],  where  rji  <  ^i+i 
and  m  <n.  Then  the  likelihood  function  (3)  is  equivalently  given  by 

i=l  k>li  k>ri 

where  li  =  sup{j  :  rjj  <  Li},  Vi  =  sup{J  :  rjj  <  Ri}  and  s  =  (si, ...,  Sm)  denote  the  vector 
of  the  probability  weights.  The  log  likelihood  of  (5,  b)  is 

£(3,!.)  -  (Yi  0*)''“).  (5) 

i=l  k>li  k>ri 

Note  that  iJ2k>ri  =  1  if  ri  =  0  and  =  0  if  li  =  m. 

C.2.  Generedized  maximum  likelihood  estimation. 

A  GMLE  of  {So,P)  is  a  value  of  (s,  6)  that  maximizes  the  likelihood  function  (5).  We 
could  follow  the  NR  algorithm  outlined  by  Finkelstein  [3].  However,  this  would  involve  the 
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inverse  of  a  matrix  of  order  {m  +  p  —  1)  x  {m  +  p  —  1).  Since  m  can  be  potentially  large 
when  n  is  large,  the  unmodified  NR  algorithm  is  not  feasible  for  a  large  data  set.  The  major 
serious  problem  of  the  unmodified  NR  algorithm,  however,  is  that  it  will  almost  always 
diverge  in  one  iteration  from  any  reasonable  starting  value.  The  divergence  is  due  to  the 
fact  that  the  first  Newton  step  will  hit  the  boundary  of  the  parameter  space.  Such  a  serious 
computational  problem  was  not  anticipated  at  the  time  we  submitted  the  DOD  grant. 

To  reduce  computation  burden  due  to  dimensional  effects,  we  propose  to  group  the  orig¬ 
inal  data  {Li,  Ri)  and  then  apply  a  two-step  modified  NR  algorithm  to  obtain  the  two-step 
estimators  (TSE)  of  {So,  0)  based  on  the  innermost  intervals  corresponding  to  the  grouped 
intervals.  In  the  first  year  of  om  research,  we  obtained  a  first  version  of  a  two-step  modified 
NR  algorithm  to  find  TSE.  We  also  carried  out  simulation  studies  to  investigate  sensitiv¬ 
ity  of  estimated  values  of  TSE  to  partition  sizes.  This  version  of  two-step  computational 
scheme,  however,  can  diverge  with  some  simulated  IC  regression  data.  We  reported  om: 
initial  findings  in  a  preliminary  manuscript  in  Wong  and  Yu  [7].  In  our  second  year  of 
research,  we  presented  an  improved  version  of  algorithm  at  the  Era  of  Hope  2002  DOD 
Breast  Cancer  Research  Program  Meeting  in  Orlando  (see  Wong  and  Yu  [8]). 

In  our  third  and  fourth  years  of  research,  we  continued  to  improve  and  update  the 
two-step  algorithm.  We  have  tested  the  latest  version  of  the  algorithm  on  many  simulated 
data  sets  and  have  not  encountered  any  convergence  difficulty. 

In  our  second  year  of  research,  we  applied  our  two-step  estimation  procedure  to  the  Cox 
regression  analysis  of  a  long-term  prognostic  relapse  follow-up  study  involving  375  women 
with  unilateral  T1-2N0,  T1-2N1  and  T3-4  breast  cancer.  All  the  patients  were  treated  at 
Memorial  Sloan  Kettering  Cancer  Center  and  the  follow-up  are  being  conducted  at  Strang 
Cancer  Prevention  Center.  The  main  objective  of  the  study  is  to  assess  the  prognostic  sig¬ 
nificance  of  bone  marrow  micrometastasis  (BMM)  in  predicting  relapse.  Standard  clinical 
variables  including  nodal  status  and  tumor  diameter  were  included  in  the  Cox  model.  Al¬ 
though  we  have  not  completely  established  asymptotic  normality  to  validate  the  P  values 
that  were  reported  for  the  study,  our  two-step  Cox  regression  analysis  gave  strong  indication 
that  BMM  was  not  as  predictive  of  relapse  as  previously  expected  (Osborne  and  Wong  [1]). 
In  our  second  year  of  research,  therefore,  we  have  moved  ahead  of  our  statement  of  work 
by  making  a  start  for  Task  8.  Since  the  BMM  relapse  follow-up  study  provides  a  complete 
and  final  data  set  that  optimally  satisfies  our  need  of  an  empirical  example  to  illustrate  our 
asymptotic  GML  procedure  for  Cox  regression,  we  have  chosen  to  focus  on  this  data  set 
instead  of  the  examples  mentioned  in  Task  8. 
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Also,  in  the  second  year  of  our  research,  we  established  consistency  of  the  GMLE  of 
P  and  So  (and  hence  S'('|2;))  under  the  following  assumptions: 

ASl:  So  is  arbitrary  and  each  of  the  censoring  variables, Fi,  takes  on  finitely  many 

values. 

AS2:  So  is  arbitrary  and  each  of  the  censoring  variables, Fi,  Ffc  is  continuous  and  some 

regularity  conditions  are  imposed  on  either  So  or  the  joint  distribution  function  G  of  K,  Fi, 
Yj(. 

Specifically,  under  ASl  and  AS2 

Pr{  lim  =  1,  (6) 

n—^oo 

and 

Pr{  hm  sup  \Soit)  -  S^o(i)|  =  0}  =  1,  (7) 

ten 

where  H  denotes  the  support  set  of  Yi, ...,  Yk-  Note  that  So{t)  is  guaranteed  to  be  consistent 
for  t  E  H,  and  not  elsewhere.  However,  the  set  H  is  not  necessarily  a  time  interval  (for 
instance,  H  may  be  a  collection  of  discrete  points).  In  order  for  the  consistency  results  to  be 
more  useful,  we  established  that  if  So  is  continuous,  and  the  support  of  Fi,  ...,  Yk  is  dense 
in  [0,  T]  for  some  T  >  0,  then  So{t)  is  consistent  for  all  t  G  [0,r].  The  practical  implication 
of  the  denseness  requirement  is  that  pointwise  consistency  of  So(t)  would  hold  only  if  all 
the  subjects  in  a  follow-up  study  must  be  followed  at  very  frequent  close  intervals. 

We  also  established  similar  consistency  results  for  the  TSE,  with  an  added  assumption 
that  the  maximal  length  of  the  partition  interval  tends  to  0  as  n  tends  to  oo.  These  results 
are  summarized  in  Wong  and  Yu  [7]. 

Asymptotic  normality  is  the  most  crucial  aspect  of  our  research  because  it  is  needed  in 
making  confidence  statements  and  in  performing  hypothesis  testing.  In  the  third  year  of 
our  research,  we  investigated  asymptotic  normality  under  assumptions 
AS3.  So  is  arbitrary  and  satisfies  a  monotonicity  condition,  and  each  of  F/c,i,  —  ?  Yk,k  takes 
on  finitely  many  values; 

AS4.  iSo  is  as  in  ASS,  and  each  of  Yk,\,  •••,  Yk,k  takes  on  countably  many  values; 

ASS.  So  is  as  in  ASS,  each  of  Yk,x,  Yk,k  is  continuous  and  some  regularity  conditions  are 
imposed  on  either  So  or  G. 

Asymptotic  normality  of  GMLE  or  TSE  is  straightforward  to  establish  under  the  finite 
assumption  ASS.  As  for  AS4  and  ASS,  we  carried  out  extensive  simulation  studies  to  guide 
omr  research.  The  studies  suggest  that  both  GMLE  and  TSE  of  ^  and  So  are  asymptotically 
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normal  under  AS4.  However,  only  GMLE  and  TSE  of  can  be  asymptotically  normal  under 
AS5.  We  have  just  completed  theoretical  proofs  to  substantiate  our  numerical  studies.  Our 
simulation  studies  suggest  that  under  ASS  asymptotic  inference  for  GMLE  and  TSE  of 
and  hence  S'(-|2;)  will  have  to  be  accomplished  via  a  bootstrap  method.  In  our  fourth  and 
final  year  of  research,  we  have  investigated  a  bootstrap  approach  for  asymptotic  interval 
estimation  of  So  when  G  is  continuous.  We  have  completed  a  manuscript  that  summarizes 
our  asymptotic  normality  findings  (see  Yu  and  Wong  [9]). 


Log-Log  Difference  Plot  For  BMM 


Figure  1. 

Cox  regression  is  appropriate  only  if  proportional  hazards  (PH)  assumption  is  satisfied 
by  the  data.  Under  the  PH  assumption,  the  log-rank  test  is  most  powerful.  At  present, 
a  statistically  useful  diagnostic  plot  for  PH  assumption  is  lacking.  Moreover,  a  formal 
significant  test  is  not  available.  In  the  third  year  of  our  research,  we  provided  statistical 
solutions  to  satisfy  both  these  needs.  For  the  diagnostic  plot,  we  proposed  to  plot  IniSi(t)  — 
lnS2{t)  verse  t  for  any  two  groups,  where  S  refers  to  GMLE  of  S  under  interval  censoring. 
A  horizontal  line  should  be  expected  if  PH  assumption  holds.  For  a  test  for  PH  assumption, 
we  proposed  an  asymptotic  chi-square  test.  In  our  fourth  and  final  year  of  research,  we 
have  completed  a  manuscript  to  report  on  our  diagnostic  solutions  (see  Wong  and  Yu  [10]). 
We  applied  the  diagnostic  procedure  to  the  BMM  data.  Figure  1  is  the  log  difference  plot 
for  BMM+  and  BMM-  groups  It  is  clear  that  PH  assumption  was  inappropriate  for  the 
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BMM  data.  The  asymptotic  chi-square  test  gave  a  P-value  of  0.013  indicating  a  significant 
departure  from  PH  assumption. 


BMM  Status 


Figure  2. 

Since  Cox  model  is  not  appropriate  for  the  BMM  relapse  follow-up  study,  we  have  little 
choice  but  to  broaden  our  research  on  regression  models  for  IC  data  by  including  semi- 
parametric  models  that  do  not  require  PH  assumption  to  hold.  In  the  second  year  of  our 
research,  we  proposed  to  fit  the  BMM  data  using  a  linear  regression  model  with  an  unknown 
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nonpaxametric  distribution  function  for  the  error  term.  We  successfully  derived  an  non¬ 
iterative  algorithm  to  obtain  the  GMLE  of  the  regression  coefficient  in  case  of  a  simple  linear 
regression  model.  When  we  applied  such  a  model  to  the  BMM  data  in  a  imivariate  analysis, 
BMM  was  significant  at  P  <  0.05.  Incidentally,  BMM  was  also  significant  by  a  weighted 
Kaplan-Meier  statistics  (see  Pepe  and  Smith  [11])  which  does  not  require  PH  assumption. 
Figure  2  gives  the  IC-version  of  the  Kaplan-Meier  plots  of  the  two  BMM  groups.  This 
example  clearly  demonstrates  that  inappropriate  application  of  the  Cox  regression  model 
can  potentially  lead  to  an  errorneous  statistical  conclusion.  We  published  a  statistical  paper 
on  the  theory  of  the  linear  regression  approach  in  the  third  year  of  our  research  (see  Yu  and 
Wong  [12])  and  gave  a  presentation  of  the  novel  analysis  of  the  BMM  data  in  the  fourth 
and  final  year  of  our  research  (see  Wong,  Osborne  and  Yu  [13]). 

D.  KEY  RESEARCH  ACCOMPLISHMENTS 

•  We  have  implemented  a  statistical  algorithm  for  computing  GMLE  of  the  regression 
coefficients  and  the  baseline  survival  function  So- 

•  We  have  implemented  a  statistical  algorithm  for  computing  TSE  of  ^  and  So- 

•  Computer  programs  for  both  GMLE  and  TSE  calculations  have  been  made  available 
to  the  public  via  the  internet. 

•  We  have  proved  consistency  of  GMLE  and  TSE  of  ^  and  So  under  both  discrete  and 
continuous  assumptions  about  the  censoring  distribution  G. 

•  We  have  performed  extensive  simulation  studies  to  investigate  the  asymptotic  properties 
of  GMLE  and  TSE  of  /3  and  So-  Our  results  have  provided  strong  evidence  that  So  is 
NOT  asymptotic  normal  when  G  is  continuous. 

•  We  have  derived  the  asymptotic  normal  means  and  covariance  matrices  of  GMLE  and 
TSE  of  p. 

•  When  G  is  finite  or  countably  infinite,  we  have  derived  the  asymptotic  means  and 
covariance  matrices  of  GMLE  and  TSE  of  So- 

•  We  have  proposed  a  diagnostic  plot  for  checking  proportional  hazards  assumption  for 
Cox  regression  and  constructed  a  chi-square  test  for  assessing  this  assumption. 

•  We  have  completed  regression  analysis  of  a  long-term  breast  cancer  follow-up  study  as¬ 
sessing  the  prognostic  significance  of  bone  marrow  micrometastasis  in  predicting  relapse 
in  a  cohort  of  375  women,  using  asymptotic  GML  Cox  regression,  weighted  Kaplan- 
Meier  statistic,  semi-parametric  linear  regression  methods. 
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E.  REPORTABLE  OUTCOMES 

•  One  oral  presentation  of  an  abstract  at  2002  ASCO  Meeting  ([!]). 

•  Two  poster  presentations  at  2002  Era  of  Hope  DOD  Breast  Cancer  Research  Program 
Meeting  and  2004  Annual  San  Antonio  Breast  Cancer  Symposium.  ([8], [13]). 

•  Three  published  abstracts  on  the  proposed  two-stage  modified  Newton-Raphson  algo¬ 
rithm  and  applications  of  the  methodology  to  a  breast  cancer  relapse  follow-up  study. 

([1],[8],[13]). 

•  A  manuscript  on  computation  of  GMLE  and  TSE  of  Cox  regression  parameters  ([7]). 

•  A  manuscript  on  consistency  and  asymptotic  normality  of  GMLE  and  TSE  ([9]). 

•  One  published  statistical  paper  on  regression  analysis  of  IC  data  ([12]). 

•  A  manuscript  on  assessing  the  appropriateness  of  proportional  hazards  assumption  for 
Cox  regression  ([10]). 

•  Computer  programs  for  calculating  GMLE  and  TSE  made  available  for  the  public  via 
the  internet  site  http://vrww.math.binghamton.edu/qyu/index.html. 

F.  CONCLUSIONS 

In  the  four  years  of  our  DOD  grant,  we  have  successfully  accomplished  omr  research 
objectives  in  developing  asymptotic  generalized  maximum  likelihood  inference  of  Cox  pro¬ 
portional  hazards  regression  model  with  IC  data.  We  have  developed  statistical  algorithms 
that  can  efl&ciently  compute  GMLE  and  TSE  of  the  regression  coefficients  and  the  baseline 
survival  function  So  for  any  reasonable  sample  size.  We  have  proved  consistency  of  GMLE 
and  TSE  of  ^  and  So  under  both  discrete  and  continuous  assumptions  about  the  interval 
censoring  distribution  G.  We  have  established  asymptotic  normality  for  GMLE  and  TSE  of 
P  for  G  imrestricted.  When  G  is  continuous,  however,  we  have  numerically  demonstrated 
that  GMLE  and  TSE  of  So  are  not  asymptotically  normal.  In  the  fourth  and  final  year 
of  our  DOD  grant,  we  have  investigated  a  bootstrap  method  for  the  asymptotic  interval 
estimates  of  So- 

Cox  regression  is  appropriate  only  if  proportional  hazards  (PH)  assumption  is  satisfied 
by  the  data.  We  have  proposed  a  useful  diagnostic  plot  for  PH  assumption  and  validated  a 
chi-square  test  for  it. 

In  our  fourth  and  final  year  of  research,  we  have  completed  the  final  version  of  a 
computer  software  for  asymptotic  confidence  intervals  and  hypothesis  testing  for  GMLE 
and  TSE  of  P  and  <S^(’|^).  We  have  also  completed  the  data  analysis  of  the  BMM  prognostic 
study. 
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The  results  which  we  have  established  will  be  useful  to  clinicians  analyzing  breast  camcer 
relapse  follow-up  prognostic  studies,  breast  cancer  researchers  pursuing  chemoprevention 
intervention  trials  involving  surrogate  endpoints  biomaxkers,  and  genetic  epidemiologists 
conducting  studies  on  familial  aggregation  of  breast  cancer  and  related  cancers. 
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Breast  cancer  bone  marrow  micrometastases:  a  long-term  prognostic  study  of 
systemic  tumor  cell  burden  on  relapse.  M,  P.  Osborne,  G,  Wong;  Cornell  Univ 
Medal  College,  New  York,  NY 

The  presence  of  bone  marrow  micrometastases  (BMM)  detected  at  the  time 
of  initial  surgery  has  been  shown  to  predict  relapse-free  survival  (RFS)  by 
different  investigators.  We  have  conducted  a  long-term  (BMM)  study 
Involving  375  women  with  unilateral  T1-2N0  (56%),  T1-2N1  (42%),  and 
T3-4  (2%)  breast  cancer.  BMM  was  determined  using  monoclonal  antibod¬ 
ies  to  cytokertatin.  Median  follow-up  was  8  years  (range  1  month  -15 
years).  BMM  was  detected  in  124  (35%)  patients.  Logistic  regression 
analysis  did  not  show  any  correlation  between  BMM  prevalence  and 
standard  prognostic  indicators,  including  lymph  node  status  and  tumor 
diameter.  The  number  of  BMM  cells  detected,  representing  the  systemic 
tumor  cell  burden  (TCB),  was  also  examine  for  its  relationship  with 
standard  prognostic  variables  using  lognormal  regression  analysis.  No 
significant  correlation  was  established.  Recent  methodology  of  interval- 
censored  survival  analysis  showed  that  BMM  prevalence  does  not  predict 
relapse.  At  e  median  follow-up  of  2.5  years,  TCB  was  a  significant 
univariate  predictor  of  early  relapse.  In  a  multivariate  analysis,  lymph  node 
positive  patients  with  high  TCB  had  a  significantly  shorter  RFS  than  all 
other  patients.  However,  at  a  median  follow-up  of  8  years,  the  prognostic 
significant  of  TCB  in  node  positive  patients  was  reduced.  Nevertheless,  the 
combination  of  a  positive  nodal  status  and  a  high  TCB  still  Identifies  a 
prognostically  poor  subgroup  with  a  5-year  RFS  around  50%.  There  were 
only  22  (6%)  such  patients  in  our  study,  mosf  of  whom  had  received 
adjuvant  therapy.  This  group  may  reflect  early  ^age  IV  disease;  therefore, 
this  group  deserves  close  attention  in  a  larger  study  to  verify  the  poor 
prognosis  we  have  observed,  as  well  as  to  evaluate  new  treatment  protocols 
to  improve  RFS. 


230  General  Poster.  Sat,  1:00  PM -5:00  PM 

Docetaxel  +  epirubicin  and  docetaxel  +  doxorubicin  are  effective  and^welt 
tolerated  first-line  treatments  for  metastatic  breast  cancer.  7?.  C.  F,  l/^nard, 
K.  M..  Malinovszky,  P.  J.  Barrett-Lee,  A.  Howell,  S.  P.  Johnston  J^uthwest 
Wales  Cancer  Inst,  Swansea  Wales,  UK;  Velindre  Hospital /Cardiff,  UK; 
Chrispe  Hospital,  Manchester,  UK;  Royal  Mersden  Hospitat/London,  UK 

Background  and  objectives;  Docetaxel  (Taxotere®)  +  apfnracyclines  have 
been  shown  to  be  efficacious  and  tolerable  first-iin^chemotherapies  for 
metastatic  breast  cancer.  A  large  phase  III  study,  306,  showed  that 
docetaxel  +  doxorubicin  is  more  effective  than  th^ld  standard  doxorubi¬ 
cin  -h  cyclophosphamide.  The  aim  of  this  on-gcpng  prospective  study  is  to 
evaluate  the  efficacy  and  safety  of  docetaxel  j^oxorublcin  or  epirubicin  in 
a  UK,  community-based,  real-life  setting  aon  to  compare  the  results  with 
those  obtained  from  TAX  306.  This  is  an  wterim  report.  Methods:  Patients 
received  docetaxel  75  mg/m^  and  eithejp^oxorubicin  50  mg/m^  or  epirubi- 
cin  75  mg/m^  D1  q3  weeks,  at  the  trying  clinician's  discretion.  To  date, 
225  patients  (WHO  performance  st^s  0-2)  have  been  enrolled,  of  which 
79  received  doxorubicin  and  14^^irublcln.  The  recommended  anthracy- 
cline  dose  was  administered  in of  cycles  for  doxorubicin  and  90%  for 
epirubicin.  All  patients  were^luated  for  safety  and  158  patients  (70%), 
who  received  s2  chemothe^y  cycles,  were  evaluated  for  tumor  response. 
Results:  Ninety-three  (5^)  patients  had  a  response  (partial  or  complete) 
to  either  regimen;  ht^ver,  there  were  no  significant  differences  in 
response  between  th^fwo  treatment  groups  (doxorubicin  61%,  epirubicin 
58%;  .p=0.71).  The  response  rate  is  similar  to  that  achieved  with 
docetaxel-doxorut^n  in  TAX  306  (59%).  Neutropenia  was  the  most 
common  advers^vent.  with  91  patients  (40%)  requiring  hospitalization; 
64  (28%)  wIth/neutropenia  and  38  (17%)  with  febrile  neutropenia  or 
neutropenic  ^psis  (some  patients  had  both).  Overall,  there  was  no 
signiflcant^erence  In  frequency  of  hospitalization  between  the  two  arms 
(doxorubl^  53%,  epirubicjn  41%;  p=0,08).  There  was  one  death  from 
neutropi^ic  sepsis  (0.4%)  in  the  epirubicin  arm.  Conclusion;  The  interim 
results^om  this  real-life  study  confirm  previous  findings  from  a  large  phase 
III  s^dy:  docetaxel  +  doxorubicin  or  epirubicin  are  effective  and  well 
toj^ated  first-line  treatments  for  metastatic  breast  cancer.  Both  regimens 
exhibited  similar  efficacy  and  safety. 
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Reduction  of  metastases  in  breast  cancer  -  fiatients  treated  wiw^reppe 
hormone  replacement  therapy  (HRT);  a  retrospective  analysipnn  972  wi 

F.  Schuetz,  I.  J.  Dief,  T.von  Holst,  U,  Haus,  G,  Baskfi;  Univers 
Jleidelberg,  Dept  Gynaecology  and  Obstetrics,  H^elberg,  Gen 
IMEREM,  Nuernberg,  Germany  / 

Substitution  of  estrogenes  and  progestins  is  the  raf^st  common  therap 
prophylaxis  for  postmenopausal  discomforts  lijf/hot  flushes,  osteopo 
etc.  However,  in  the  majority  of  studies  long  tdrm  HRT  has  been  assoc 
with  an  slightly  increased  risk  of  breast  c^er.  On  the  other  hand  pal 
with  preoperative  HRT  have  a  lower  mojiality  and  a  longer  overall-sur 
For  further  investigation  we  examin^ 972  patients  between  45  an 
years  at  the  time  of  the  first  diagn^is  of  breast  cancer  with  and  wl 
HRT  with  regard  to  the  incidenc^f  bone  metastases.  241  patients 
premenopausal  (mean  48^3.by),  731  were  postmenop; 
(55.5±4.4y),  303  of  them t^elved  HRT  (group  HRT+)  and  428  pal 
not  (group  HRT-).  PatienjB  of  group  HRT+  received  estrogenes  o 
minimum  of  1  year  {rym  5.5 ±4.^).  Although  the  tumor  size  of  { 
HRT- was  significan^igher  than  in  group  HRT+  (5.5±1.8  vs.  2.1± 
nodal  status,  S-p^i^  fraction,  grading  and  hormone-receptor  $ 
showed  no  differ^ces:  Adjuvant  treatment  in  the  postmenopausal  gi 
were  also  not  sj^ificantly  different  In  regard  to  the  incidence  of  met 
ses  patients  without  HRT  have  significantly  (p<0.001)  more  bone  met 
ses  (49  pa)llents  of  group  HRT-  versus  5  patients  of  group  HRT-l^). 
pujmonaJ^18:2)  and  liver  (28:6)  metastases  were  significantly 
frequei^in  patients  without  an  preoperative  HRT.  It  was  shown  m  v/h 
in  ciin/al  bisphosphonate  trials  that  a  normalization  of  bone  metabolK 
able  j6  reduce  subsequent  bone  rnetastases  efficiently.  We  may  asi 
thatine  incidence  of  bone  metastases  can  be  reduced  normalizing 
melilbolism  (soil)  and  levering  conditions  of  tumor  cell  seeding  by  HR 
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Weekly  docetaxel  with  or  wHhout  ci^costoroid  premedicatiM  as  fpR 
second-line  treatment  In  patients  (pts)  with  metastatiabreasti^ncor  XM^ 

^emrnler,  W.  Mair,  Af.  Stauch,  J.  Papke,  G.  Deutsch,  Yf  Abenhmd 
Dorn,  C.  Kentenich,  C,Jackisch,  S.  Leinung,  O.  Brudler,jf.  Vehling-Ki 
J,  Stamp,  M.  Malekmohammadi,  V.  Heinemann;  Medial  D^t  III,  jjil 
sity  of  Munich,  Munich,  Germany;  Oncologic  Practidf,  Munich,  (Je/ra 
Deaconess  Hospital,  Karlsruhe,  Germany;  Acadynic  Teaching  Ho^ 
Aschaffenburg,  Germany;  Dept  of  Gynecoh^  University  of  M^ 
Munich,  Germany;  Dept  of  Gynecology,  Uni^kity  of  Munster,  MUih 
Germany;  Aventis  Pharma  Deutschland,  Ba^oden,  Germany  .  . 

Objective:  This  large  phase  M  study  wyT  designed  to  evaluate  it) 
efficacy  of  a  weekly  schedule  of  docetaifel  as  first  or  second-line  the 
and  (2)  toxicity  with  or  without  corticc«jmroid  premedication.  Methods: 
pts  (median  age  58,  range  37-80^ith  MBC  were  included  in  thet 
Docetaxel  was  given  at  weekly  do^of  35  mg/m^  x6,  followed  by  a 
rest.  Additbnal  cycles  with  3  wefAs  of  treatment  and  2  weeks  of  resl^ 
adrhinistered  until  disease  proyession.  The  first  34  pts  were  randomliit 
receive  dexamethasone  8  ylg  prior  to  docetaxel  or  no  premedi^ 
Results:  To  date,  1 10  pts  ttife-line  16,  second-line  94)  were  evaluabk 
toxicity,  all  had  measuraWfe  disease  and  ECOG  p^rmance  status:s^ 
99  pts  who  received  >64loses  of  docetaxel  were  evaluable  for  respoN? 
total  of  1367  doses yf  docetaxel  were  given  (median  10,  range  1-] 
Response  (first/second-line):  CR  1/9  pts  (7.1%/10.6%),  PR  4/23 
(28.6%/32.9%),  SO  4/28  pts  (28.6%/32.9%)  and  PD  5/20  pts  (35.1 
23.6%).  Overall/sponse  rate  was  35.7%/43.5%.  Median  time  to  prog 
sion  was  6.6/y  months  and  median  survival  was  14.2  months  (iw* 
reached  for  tifet-line  pts).  Hematologic  toxicity  was  usually  mild 
moderate,  vmh  no  difference  between  the  two  premedication  ^ 
Non-hemat/ogic  toxicity  in  percent  of  pts  {+1-  steroids)  included;  Gf» 
and  11  ptfural  effusions  l.l%/5.9%,  edema  2.2%/5.9%,  lacriiDd 
10.8%/W.6%,  epistaxis  9.7%/17.6%,  nail  changes  12.9%/35.3%. Tfl 
ment  delayed  due  to  neutropenia  in  99  cycles  (7.2%),  and  omitte* 
67  cy^es  (4.9%).  Dose  reductions  of  level  1/2  (-5/-10  m^m^/dosejn 
required  in  10/5  pts  (9%/4.5%).  Conclusions:  The  results  of  this  ^ 
confirm  that  a  weekly  schedule  of  docetaxel  35  mg/m^  is  efficient  and® 
Corticosteroid  premedication  as  generally  recommended  is  mandatory- 
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Abstract:  We  consider  the  estimation  problem  under  the  Cox  proportional  hazards  model  with  interval  - 

censored  data.  Under  this  model,  the  survival  function  S  at  time  x  given  the  covariate  z  satisfies  S'(a;|.e)  = 

(5o(x))®  where  5o  is  a  baseline  survival  function.  0  and  So  axe  estimated  by  the  generalized  maviTmim 
likelihood  estimator  (GMLE).  The  Newton-Raphson  (NR)  method  and  the  profile  likelihood  (PL)  method  for 
obtaining  the  GMLE  do  not  work  most  of  the  time  in  our  simulation  study  and  our  cancer  research  data,  as 
the  maximum  value  of  the  likelihood  is  achieved  outside  the  parameter  space  and  the  GMLE  is  achieved  on  the 
boundary  of  the  parameter  space  in  these  cases.  We  propose  a  different  algorithm  to  compute  the  GMLE.  The 
algorithm  is  able  to  search  for  the  GMLE  along  the  boundary  as  well  as  within  the  parameter  space.  We  also 
propose  to  group  the  data  to  reduce  the  dimension  of  the  parameter  space.  Simulation  results  suggest  that  the 
estimator  is  consistent.  We  apply  our  method  to  the  cancer  cosmesis  study  and  to  another  cancer  research  data. 

1.  Introduction  We  consider  the  estimation  problem  under  the  Cox  proportional  hazards  model  (Cox,  1972) 
with  interval  -censored  data. 

Interval-censored  data  are  encoimtered  in  many  areas  of  the  medical  research.  For  instance,  in  rliniral  cancer 
relapse  follow-up  studies,  the  study  endpoint  is  disease-free  survival.  When  a  patient  relapses,  it  is  usually  known 
that  the  relapse  takes  place  between  two  follow-up  visits,  and  the  exact  time  to  relapse  is  unknown.  Let  X  denote 
a  time-to-event  variable  with  distribution  F{x)  =  Pr{X  <  x),  or  equivalently,  survival  function  5(x)  =  1  -  F{x). 
Then  X  is  not  observed  and  is  only  known  to  lie  in  an  observable  interval  {L,R].  A  standard  method  is  to  use 
the  generalized  maximum  likelihood  estimator  (GMLE)  to  estimate  S. 

The  Cox  proportioned  hazards  model  specifies  that  covariates  have  a  proportional  effect  on  the  hazard  function 
of  the  failure  time  distribution,  namely,  the  survival  function  at  X  =  x  given  Z  =  z  can  be  represented  by 

5(x|z)  =  [5o(x)]®‘^  (1.1) 

where  zS  =  z'/3,  i-e.,  the  dot  product  of  two  vectors,  5o(x)  is  a  baseline  survival  function  and  is  a  p  dimen¬ 
sional  regression  coefiicient  vector.  This  model  provides  powerful  means  for  fitting  failure  time  observations  to  a 
distribution  free  model  and  for  estimating  the  risk  for  failure  associated  with  a  vector  of  covariates. 

Finkelstein  (1986)  applied  the  Cox  model  to  the  analysis  of  interval-censored  data.  Huang  (1996)  studied 
the  asymptotic  properties  of  the  GMLE  of  the  regression  parameters  in  the  Cox  model  with  current  status  data, 
which  is  a  special  case  of  interval-censored  data.  There  is  no  explicit  expression  for  the  GMLE,  thus  one  has  to 
use  numerical  methods.  Finkelstein  suggested  to  use  the  Newton-Raphson  (NR)  method  to  compute  the  GMLE. 

^  Partially  supported  by  Army  Grant  DAMD17-99-1-9390  and  DAMD17-00-1-0448. 
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Huang  suggested  to  use  a  profile  likelihood  (PL)  approach.  However,  both  authors  did  not  really  compute  any 
estimates  under  Cox’s  model  in  their  paper,  except  imder  the  restricted  model  that  ^  =  0.  In  fact,  in  our 
simulation  studies  these  methods  do  not  work  most  of  the  time.  Also  these  methods  do  not  work  in  the  data  set 
used  in  Fmkelstein  (1986).  We  shall  illustrate  the  reasons  via  a  simple  data  example.  To-date,  how  to  find  the 
GMLE  in  the  Cox  regression  model  remains  unsettled. 

There  are  three  computational  problems  in  the  Cox  regression  model  approach  with  interval-censored  data: 

(a)  a  value  of  the  likelihood  function  at  a  non-monotone  estimate  of  So  can  be  greater  than  that  at  the  GMLE; 

(b)  the  GMLE  may  occur  on  the  boimdary  of  the  parameter  space; 

(c)  there  are  often  too  many  parameters  to  be  estimated. 

If  case  (b)  holds,  one  may  apply  the  NR  method  to  the  boimdary  of  the  parameter  space.  However,  this 
approach  is  not  feasible  if  n  is  moderate  (see  Appendix  2). 

In  this  paper,  we  propose  a  different  algorithm  to  deal  with  Troubles  (a)  and  (b).  The  main  idea  is  to  allow 
searching  the  GMLE  along  the  boundary,  as  well  as  within  the  parameter  space,  dynamically.  We  also  propose  to 
group  the  data  in  a  certain  manner  to  reduce  the  dimension  of  the  parameter  space.  In  a  cancer  research  data  set, 
the  number  of  parameters  is  reduced  ftom  59  to  8.  The  paper  is  organized  as  follows.  In  Section  2,  we  introduce 
notations  and  the  model.  In  Section  3,  we  introduce  our  procedure.  In  Section  4,  we  apply  the  method  to  two 
cancer  research  data.  In  Section  5,  we  present  simulation  results.  The  results  suggest  that  the  new  proposed 
procediure  is  feasible  and  the  estimates  of  parameters  and  their  variances  are  consistent.  Some  tedious  calculation 
is  put  in  Appendix  1.  In  Appendix  2,  we  present  a  simple  example  to  illustrate  that  the  NR  method,  a  scaled 
NR  method  and  a  PL  method  do  not  work. 

2*  The  Cox  regression  model  and  notations.  We  shall  first  describe  the  model.  Let  Yk^i  <  <  •  *  *  < 

Yk,k  denote  the  follow-up  times  for  a  patient  who  made  K  follow-up  visits,  in  a  longitudinal  follow-up  study. 
Since  the  number  of  visits  for  each  patient  may  vary,  K  is  s.  positive  random  integer.  For  convenience,  define 
Yk^o  =  0  and  Yk^k-\-i  =  oo-  The  time-to-event  variable  of  interest,  X,  is  not  directly  observed;  instead,  it  is 
known  to  lie  in  between  two  successive  censoring  time  points  (TjcjjTjrj+i),  where  j  =0,  ...,  K,  Note  that  X  is 
left  censored  if  j  =  0,  strictly  interval  censored  if  0  <  j  <  FT,  and  right  censored  i£  X  >  Yk,k-  The  observable 
interval-censored  data  corresponding  to  X  is  given  by 

(L,  R)  =  {YK,uYK,i^{)  if  Yk.i  <X<  YK.i^u  i  =  0, 1, K,  (2.1) 

In  addition  to  (L,  R),  we  also  observe  a  p  x  1  covariate  vector  Z.  We  assume  that  K  and  Yk  i’s  are  independent 
oi{X,Z), 

Let  (Li^Ri^Zi)^  i  =  1,  ...,  n,  be  a  random  sample  of  interval-censored  observations  with  covariates.  The 
likelihood  function  is 

L  =  n((5(Li))*"‘  -  (2.2) 

i-1 

where  S  is  a  survival  function,  and  6  is  a  p  x  1  dimensional  vector.  The  GMLE  of  (5o,  13)  is  a.  value  of  (5,  b)  that 
maximizes  the  likelihood  function  L  in  (2.2)  over  all  possible  survival  functions  S  and  b  E  TV, 

To  compute  the  GMLE,  we  shall  introduce  some  notations.  We  say  a  set  is  a  finite  intersection  of  observed 
intervals  /^’s  (see  (2.1))  with  end-points  Li  and  Ri  if  the  set  is  an  intersection  of  one  or  more  observed  intervals. 
We  say  an  interval  A  is  an  innermost  interval  if  it  is  a  nonempty  finite  intersection  of  the  observed  intervals  such 
that  for  each  observed  interval  /|,  1%  and  A  are  either  disjoint  or  nested.  Suppose  there  are  totally  m  innermost 
intervals.  Let  ^i  and  7^*,  i  =  1, 2, . . . ,  m  denote  the  left  and  right  end-points  of  the  ith  innermost  intervals,  where 
Vi  ^  For  convenience,  define  t]q  =  — oo. 

It  follows  fi:om  Peto  (1973)  or  Turnbull  (1976)  that  the  GMLE  of  S  places  all  probability  weights  on  the 
innermost  intervals.  Thus  it  suffices  to  maximizes 

L  =  fliiSuY"*  -  iSr.r"*)  otC  = 

i-1  i=l 

where  Si  =  S(7]i),  U  —  sup{j  :  rjj  <  Li,j  >  0}  and  =  sup{j  :  r^j  <  Ri,j  >  0},  One  can  construct  a  numerical 
example  that  —C  is  not  convex  and  has  multiple  stationary  points,  even  subject  to  the  monotone  condition. 
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3.  Methods.  We  shall  introduce  the  new  procedure  in  this  section. 

3.1.  Grouping.  Grouping  can  reduce  the  dimension  of  the  parameter  space.  We  group  the  original  data  as 
follows.  Partition  the  whole  range  of  data  points  into  p  subintervals.  For  example,  group  the  data  in  the  unit 
of  month,  half-year  or  year.  Let  <  •  •  •  <  tp  be  the  partition  points.  Denote  data  after  grouping  by 

where  =  tj  if  tj  <  Li  <  Lj^i  and  =  Rk  if  Rj^i  <  Ri<  Rj,  to  =  -oo  and  tp+i  =  oo. 

After  grouping,  compute  the  GMLE  using  the  method  in  §3.2  based  on  the  grouped  data  (L*,  B*)’s.  We  shall 
assume  that  the  data  hereafter  have  been  grouped  and  abusing  notation,  we  denote  them  by  (Li,ili)’s  instead. 

3.2.  A  feasible  algorithm  for  the  GMLE.  Abusing  notations,  we  identify  5  with  a  vector  (5i,...,5m). 

Similarly,  we  identify  5^*)  with  ). 

Step  0.  Let  =  0  be  the  initial  estimate  of  13  and  the  GMLE  of  a  survival  function  with  observations  (Lj,  jRj), 
j  —  1,  ...,  n  be  the  initial  estimate  of 

Step  z  +  1  (i  >  0).  Let  6^*^  and  be  the  updated  values  of  b  and  S  at  Step  i.  Do  fe-step  and  S-step  as  follows. 

*  (fe-step)  With  S  =  5^*)  fixed,  find  a  fe  so  that  the  likelihood  function  increases.  Denote  the  up¬ 

dated  estimate  fe  by  fe^*"*"^).  In  particular,  one  can  use  the  NR  method  to  obtain  the  maximum  point  fe  of  the 
likelihood  function  with  the  given  S  — 

*  (5-step)  With  fe  =  fe^*"*"^)  fixed,  search  a  non-increasing  5  so  that  the  likelihood  function  £(*,fe(*"*‘^^)  is 

maximized  (or  increases).  Denote  the  up-dated  estimate  5  by  In  order  to  guarantee  the  up-dated  So 

is  nondecreasing,  proceed  as  follows.  Let  At  Sub-step  j  {j  —  1,  ...,  m),  update  (5i,...,5,yi) 


by  where  and  = 


+W 


l+ti 


1+u 


number  maximizing  where  5.,„  = 

Note:  If  such  Uo  is  difficult  to  choose,  one  may  choose  a  satisfying 


if  h<j, 

^h>j, 


h  =  l, m,  tio  >  0  is  a 


(3.1) 


In  particular,  if  |rhiL(6(‘+i),5.,„)|^^o  >  0,  Uo  =  c*'^lnI(6('+i),5.,„)|^^o,  where  5..„  =  (5i, and 
k  is  the  smallest  non-negative  integer  that  is  smaller  than  Ko  such  that  Inequality  (3.1)  holds. 

Stop  at  convergence. 

Expression  of  the  partial  derivatives  can  be  foimd  in  Appendix  1. 

Remark.  Let  pi  be  the  weight  on  the  innermost  interval  (^i,»?i],  p  =  (pi,  ...,Pm)  and  p^*^  the  updated  value  of 
p  at  the  ith  step.  Since  5(7?^)  =  pi^i  -}-•••  +Pm^  the  5-step  can  also  be  replace  by  the  p-step  as  follows. 

(p-step)  With  fe  =  fe(*+^)  fixed,  search  a  non-increasing  5  so  that  the  likelihood  function  £(*,fe^*+^))  is 


maximized  (or  increases). 


{P\ 


..pm^^^’^),  where  p^* 


Let  =  p(*).  At  Sub-step  j  {j  =  1, 


,(»+!)  _ 


Pjy^C 


and  =  .{ 

I  14-ti 


m),  update  (pi,...,Pm)  by 


if  h  =  i, 


if  h  ^  j, 

number  maximizing  L{b^*\S.,u)  where  5.,„  =  and  Si,n=Pi+i,u  +  ■ 

Moreover,  the  restriction  u  >  0  can  be  replaced  by  «  >  — 


h  =  l,...,m,  Uo  >  0  is  a 


+  Pn 


3.3.  Estimation  of  the  covariance  matrix.  It  is  well  known  that  the  GMLE  of  a  survival  function  based  on 
interval-censored  observation  (Li,  iii)’s  may  not  be  strictly  decreasing  on  the  set  {rji  :  i  =  0, 1,  Thus  the 

GMLE  S  of  So  may  satisfies  S{rii)  =  S{i]i+i)  for  some  i.  In  the  latter  case,  according  to  our  simulation  results,  it 
is  often  that  the  empirical  Fisher  information  matrix  is  ill-conditioned  or  singular,  unless  we  modify  C  as  follows. 

Delete  the  innermost  intervals  at  which  the  GMLE  of  So  assigns  no  weight.  Let  (^*,J?*],  i  =  1,  m*,  be 

the  remmning  innermost  intervals.  Modify  C  as 


r=5]iog{[5,;r“‘-[5.;r"‘}, 

t=i 

where  I*  =  sup{j  :  Tfj  <  L*}  and  r*  =  s\xp{j  :  rij  <  Then  use  the  inverse  of  the  empirical  Fisher  information 
matrix  corresponding  to  this  modification  as  the  estimate  of  the  covariance  matrix. 

This  modification  works  well  in  our  applications  and  simulations. 
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4.  Application. 

Example  4.1.  We  apply  oxir  procedure  to  a  breast  cancer  relapse  follow-up  study  based  on  data  obtained  from 
375  women  with  stages  I  -  III  unilateral  invasive  breast  cancer  surgically  treated  at  Memorial  Sloan-Kettering 
Cancer  Center  between  1985  and  2001.  The  median  follow-up  duration  was  7.4  years.  Relapse  time  was  given 
by  the  time  interval  between  surgery  and  the  initial  relapse.  A  relapse  that  took  place  between  two  successive 
follow-up  visits  was  regarded  as  interval  censored.  If  a  patient  did  not  relapse  toward  the  end  of  the  study, 
then  her  relapse  time  was  right  censored.  Of  the  375  observations,  288  were  right  censored  (no  relapse),  20 
were  left  censored  and  67  were  strictly  interval  censored  (87  relapses).  The  tmnor  diameter  (i.e.,  the  diameter 
of  the  tumor  which  was  removed  in  surgery),  the  number  of  lymph  nodes  with  metastisis,  and  bone  marrow 
micrometastasis  (BMM)  were  determined  for  each  woman  at  the  time  of  surgery.  Denote  the  three  covariates 
by  Ui,  U2  and  ZJa,  respectively.  The  first  variable,  (tumor  size),  is  discretized  as  =  l(Xi>3);  tbe  second 
variable,  (number  of  lymph  nodes),  is  discretized  as  Z2  =  the  BMM  variable  is  discretized  as 


if  the  number  of  metastasis  bone  marrow  cells  >  14, 
otherwise. 


Table  1.  Results  on  estimating  0  with  cancer  data. 


data 

01 

02 

^3 

original 
grouped  in  months 
grouped  in  half-years 
grouped  in  years 

0.580  (0.229) 
0.576  (0.229) 
0.577  (0.229) 
0.569  (0.229) 

0.886  (0.242) 
0.888  (0.242) 
0.884  (0.242) 
0.889  (0.242) 

0.344  (0.293) 
0.338  (0.293) 
0.337  (0.293) 
0.328  (0.293) 

The  NR  and  PL  method  do  not  work  for  this  data  set  due  the  phenomenon  (a).  We  use  the  proposed 
procedure  to  obtain  the  6MLE.  The  GMLE’s  of  the  regression  coefficients  based  on  the  original  data  and  the 
grouped  data  axe  presented  in  Table  1.  We  only  present  the  GMLE  of  the  regression  coefficients. 

The  number  of  parameters  to  be  estimated  is  56  in  the  original  data.  There  are  54  innermost  intervals  with 
21  of  them  having  positive  weights.  We  abo  grouped  data  by  months,  half-years  and  years.  The  numbers  of 
parameters  are  16, 12  and  8,  respectively.  Thus  properly  grouping  indeed  reduces  the  dimension  of  the  parameter 
space.  The  GMLE’s  of  )3i’s  and  their  standard  errors  are  presented  in  Table  1  too.  SE’s  in  Table  1  were  computed 
by  the  procedure  introduced  in  §3.3. 

Example  4.2.  (Breast  cancer  cosmesis  study).  The  data  set  can  be  found  in  Finkelstein  and  Wolfe  (1985). 
We  refer  the  reader  to  that  paper  for  a  complete  description  of  the  study.  Finkelstein  (1986)  applies  the  Cox 
regression  model  to  compare  the  patients  who  received  adjuvant  chemotheraphy  to  those  who  ffid  not  in  this 
study.  So  there  is  one  covariate,  the  indicator  that  the  patient  received  adjuvant  chemotheraphy.  There  are  94 
patients  in  the  study.  There  are  30  innermost  intervals.  That  is,  m  =  30  and  p  =  1.  Thus  there  are  29  parameters 
related  to  the  imderline  survival  function  and  one  parameter  related  to  covariate  in  the  Cox  model.  The  GMLE 
of  ^  is  0.80  with  a  standard  error  0.29.  A  GMLE  of  So  is  a  step  function  taking  jumps  at  13  points  as  given 
below. 

/  tj  5  7  8  12  17  19  20  25  31  34  39  48  N 

\S:  0.97  0.96  0.92  0.87  0.83  0.81  0.70  0.65  0.58  0.57  0.43  0.27  ) 

Note  we  only  give  12  points  above,  as  there  are  two  points  that  are  very  close.  One  can  check  that  case  (a)  is 
true  for  this  data  set,  and  the  NR  or  PL  method  does  not  work. 

5.  Simulation.  In  order  to  assess  the  asymptotic  properties  of  the  GMLE,  we  carried  out  simulation  studies. 
Hereafter  denote  Exp{fi,cr)  a  distribution  with  the  pdf  f{x)  =  The  underlying  distributions 

are  as  follows:  X  has  an  exponential  distribution  Exp{5, 5).  The  covariate  Z  =  (Zi,  Z2,  Zs)',,  where  Zi,  Z2  and 
Z3  are  i.i.d.  from  a  discrete  distribution  with  pdf  f(i)  =  =,? — :,  i  =  1, ...,  6.  (i,  R)  is  generated  by  the  following 

2^,=i  J 

scheme: 

f  (0,£f)  j£X<U, 

(L,R)  =  ^  (20,oo)  ifA:>20, 

[iu  +  kV,U+{k  +  l)V)  if  A’<20,feF<  A:-£/<(*  +  l)V'andA:>l. 

where  f/  ~  17(0, 2)  and  V  ~  17(0, 2.3). 
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We  carried  out  simulation  with  sample  sizes  50,  200  and  400,  and  with  1000  replications  for  each  case.  The 
sample  means  and  the  standard  errors  (SE)  are  presented  in  Tables  2,  3  and  4.  In  grouping  data,  we  tried  lengths 
of  intervals  3,  5,  and  8. 


Table  2.  Grouping  effect  on  estimating  0  when  n  =  200. 


grouping 

width 

Pi 

P2 

Pz 

true  value 

-0.1 

0.2 

0 

GMLE 

-0.091  (0.032) 

0.194  (0.034) 

3 

GMLE 

-0.100  (0.041) 

0.211  (0.044) 

5 

GMLE 

-0.105  (0.046) 

0.214  (0.048) 

-0.104  (0.047) 

8 

GMLE 

-0.109  (0.055) 

0.213  (0.057) 

-0.109  (0.053) 

Table  3.  Simulation  results  on  convergence. 


Pi 

02 

0z 

n  =  50 
n  =  200 
n  =  400 
true  value 

-0.123  (0.142) 
-0.109  (0.055) 
-0.105  (0.037) 

-0.1 

0.238  (0.127) 

0.213  (0.057) 

0.207  (0.039) 

0.2 

-0.092  (0.129) 
-0.109  (0.053) 
-0.107  (0.038) 

-0.1 

Table  4.  Simulation  results  on  estimating  of  variances  when  n  =  400 


Pi 

~pi 

SE  of  Pi 

SE  of 

width  =8 

n=400 

-0.1 

-0.105 

0.037 

0.037 

0.002 

0.2 

0.207 

0.039 

0.038 

0.002 

-0.1 

-0.107 

0.038 

0.037 

0.002 

width  =5 

n=400 

-0.105610 

0.033031 

0.033080 

0.001190 

0.206348 

0.034651 

0.034211 

0.001282 

-0.105921 

0.033598 

0.033039 

0.001098 

width  =8 

n=200 

-0.108608 

0.055201 

0.042939 

0.003206 

0.213256 

0.056661 

0.047957 

0.004768 

-0.108610 

0.053448 

0.042963 

0.003164 

width  =5 

n=200 

width  =3 

n=200 

width  =0 

n=200 

width  =8 

n=50 

Table  2  indicates  that  the  GMLE  does  not  change  much  after  grouping,  though  the  SE  increases,  as  expected. 
Table  3  presents  simulation  results  based  on  grouping  intervals  of  length  8.  The  table  suggests  that  the  GMLE 
of  the  regression  coefficients  based  on  grouped  data  are  consistent.  Table  4  present  the  mean  of  Pi,  the  SE  of  0i, 
the  mean  of  the  estimates  of  the  SE  of  and  the  SE  of  the  estimates  of  the  SE  of  Pi.  Thble  4  suggests  that 
estimates  of  the  variances  of  the  GMLE’s  match  the  sample  variances  quite  well. 


Appendix  1. 

We  derive  the  partial  derivatives  needed  in  §3  here.  Given  the  right  boundary  points  of  the  IM’s,  say, 
to  =  —oo  <ti  <  •••  <tm  —  oo,  denote  Si  =  S{ti),  Si^  =  S{Li)  and  =  S{Ri),  i.e.,  li  (resp.  ri)  represents  the 
index  j  such  that  Sj  =  S(Lj)  (resp.  Ri).  Then 


rin(l-(5,,r^) 


if  Zi  =  0, 

if  Tj  =  m, 

if  0  <  Zi  <  Tj  <  m. 


5 


ln(5fcj  ifO</K<m, 
0  if  Ai  =  0. 


2z,t\r.2  Bi{k)-Bi{ri) 


ainL  ^  fc  f  Bo{li)-Bo{ri)  if  ^  m 

A,  where  A  =  <  H  r,  m 

Iln5,.  if  r<  =  m, 

and  Boihi)  =  |  "f  0  < 

^  ^  lO  if  Ai  = 

_1.  ,  c^ztb\p2  Biik)  -  Bi{ri) 

where  Bilhi)  =  (  if  0  <  Aj  <  m, 

lo  ifAi  =  0. 

Write 

ifA>J.  1^  +  1  ifA>j. 


{Sur*'’  -  {Sr,r^^ 


Write 


Moreover,  Afe(O)  =  Sj, 


Afe(«) 


BUjkju)  __  f~a+i^8  if*<j, 

l-lfep  ifA>i. 

d^Ujkju)  _  f  2  if  A  <  i, 

12^  if*>i. 

^Afe(O)  f  if  A  <  i  and  0  <  A  <  m, 

^ —  =  s  1  -  5j-  if  A  >  j  and  0  <  A  <  m, 

0  otherwise. 

d^Uik(O)  if  A  <  j  and  0  <  A  <  r 

—  “  1  2(5j-  -  1)  if  A  >  j  and  0  <  A  <  m, 
^  0  otherwise. 


Abusing  notations,  write  Sj  =  Sj(u)  =  Ujk{u). 


dlnL. 

du 


Appendix  2 

We  use  a  simple  numerical  example  to  illustrate  why  the  various  existing  algorithms  do  not  work  for  the 
GMLE. 

§1.  Consider  fitting  Cox’s  regression  model  with  five  observations  {Li,Ri,Zi):  (2,5,0),  (3,4,0),  (5,9,1),  (1,6,1), 
(7,8,0).  It  can  be  viewed  as  data  from  two  groups,  corresponding  to  =  0  or  1.  Then,  the  innermost  intervals 
are  (3,4),  (5,6)  and  (7,8).  Let  the  weights  on  these  innermost  intervals  be  pi,  p2  and  ps,  with  pi  H-p2  +P3  =  1  and 
Pi  >  0.  Note  that  the  baseline  survival  frmction  5  satisfies  5(4-)  =  1,  5(4)  =  5(6-)  =  P2+P3,  5(6)  =  5(8-)  =  P3 
and  5(8)  =  0.  For  this  example,  it  is  more  convenient  to  express  the  likelihood  as  a  frmction  of  Pi’s  rather  than 

5.  The  likelihood  is  L  =  PiP3(l  —  P3  )(P2  +  PzY  •  Since  pi  +  P2  +  P3  =  1?  in  view  of  L,  it  is  simpler  to  write  the 
log  likelihood  as 

I  =  log[p?P3(l  -  pi)«®  (1  -  pf)].  (^.1) 

The  parameter  space  is  fl  =  {(0,pi,p3) :  0  €  (-oo,oo),pi  >  O.ps  >  0,p  +  l  +  ps  <  1}  with P2  =  1  -pi  -ps.  For 
convenience,  we  write  a  =  hereafter.  Thus, 

I  =  2 logpi  +  logp3  +  a log(l  -  pi)  +  log(l  -  p^). 

Since  the  likelihood  function  has  only  three  variables,  it  can  be  shown  by  direct  derivation  that  the  GMLE  of 
(B,Pi>P2,P3)  is  approximately  (-0.461,2/3,0,1/3). 
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In  general,  the  likelihood  is  not  so  simple  and  one  needs  to  compute  the  GMLE  by  numerical  methods. 
We  shall  illustrate  by  this  example  that  several  naive  numerical  methods  fail  to  yield  the  GMLE.  They  include: 
(a)  the  Newton-Raphson  (NR)  method;  (b)  the  scaled  NR  method  and  (c)  the  profile  likelihood  (PL)  method. 
Finally,  we  shall  illustrate  by  this  example  why  our  new  algorithm  can  yield  the  GMLE.  The  main  difference  is 
that  the  first  three  algorithms  cannot  search  the  GMLE  along  the  line  p2  =  0  (or  pi  +  ps  =  1),  while  the  new 
algorithm  can.  Note  that  the  GMLE  is  on  boundary  p2  =  0. 

§2,  In  order  to  apply  the  NR  method,  we  need  to  compute  the  partial  derivatives. 


£=l0g(l-w)- 

dl  2  g 

dpi  Pi  I- Pi 


Pz  IogP3 
1-p?  ’  da^ 

—  = 
dpz  ~ 


■Pz 

and 


Ps 


PsClogPs)^ 

(1-P?p  ’ 

{A.2) 

I-P?' 

{A.Z) 

§2.1.  (The  NR  method).  At  the  GMLE  (pi,P3)  =  (2/3, 1/3)  with  ^  =  —0.461,  Equation  (A.3)  yields  that 
the  gradient  in  (pupz)  is  (1.11, 1.11).  In  other  words,  as  (pi,P2)  moves  towards  outside  the  parameter  space,  the 
likelihood  increases.  Thus  the  maximinn  value  of  L  without  the  restriction  of  the  parameter  space  can  only  be 
achieved  outside  the  parameter  space.  The  NR  yields  the  unrestricted  maximum  point  of  L.  Thus  the  solution 
to  the  NR  algorithm  is  not  the  GMLE. 

§2.2.  A  scaled  NR  method  is  as  follows. 

Let  =  0  or  a  =  1  he  the  initial  value,  and  let  the  GMLE  (or  SCE)  of  (puPs)  at  ^  :=0  be  the  initial  value 
ofipuPz)- 

Step  1.  Maximize  L  over  0  with  given  up-dated  (puPs)  using  the  NR  method. 

Step  2,  Maximize  L  over  (pi,P3)  with  up-dated  jS  using  a  scaled  NR  method,  that  is,  scale  the  increments 
Api  in  the  original  NR  algorithm  by  a  constant  c  so  that  the  updated  (pi,P3)  remains  in  the  parameter  space. 

Repeat  Steps  1  and  2  until  convergence. 

However,  it  does  not  work  in  this  example.  In  particular,  in  the  initial  step,  we  have  =  0  (or  a  =  1)  and 
(PijPs)  =  (3/5, 2/5).  In  Step  1,  L  is  maximized  by  a  =  0.76  (see  Eq.  (A.4)).  In  Step  2,  by  Equation 

(A.3),  the  gradient  at  (puPs)  =  (3/5, 2/5)  is  (1.44, 0.61).  Thus  {pupz)  shotild  be  up-dated  to  (|-hl.44a:,  |+0.61») 
for  some  x  >  0.  If  a;  >  0,  it  violates  the  constraint  pi  +P3  <  1.  Thus  the  algorithm  stops  at  5(4)  =  p2  +P3  =  2/5 
and  5(6)  =  p^  =  2/5  with  ^  =  logO.76  (=  —0.274),  which  is  not  the  GMLE. 

§2.3.  A  PL  approach  is  as  follows: 

The  initial  step  and  Step  1  are  the  same  as  in  the  scaled  NR  method  above. 

Step  2  (pi-substep).  Maximize  L  overpi  with  up-dated  p^  and  /3. 

Step  3  (pz-substep).  Maximize  L  overpz  with  up-dated  pi  and 

Repeat  Steps  1,  2  and  3  until  convergence. 

However,  the  PL  method  still  does  not  work.  In  particular,  at  Step  1,  a  =  0.76,  pi  =  0.6  and  pz  =  0.4.  The 
gradient  at  (pupz)  =  (0.6, 0.4)  is  (1.44,0.61).  Thus  we  move  (pi,P3)  either  to  (0.6  +  1.44a:, 0.4)  with  a:  >  0  (pi 
substep),  or  to  (0.6, 0.4  +  0.61a:)  with  x  >Q  [pz  substep).  If  x  >  0,  both  the  pi-substep  and  the  p3-subtep  wiU 
move  [puPz)  outside  the  parameter  space.  Consequently,  it  will  stop  at  the  value  which  is  not  the  GMLE. 

§2.4.  There  are  three  line  segments  in  the  boundary  of  the  parameter  space  in  (pi,P3).  They  are  pi  =  0, 
P3  =  0  and  Pi  +P3  =  1.  One  can  find  the  value  that  maximizes  the  likelihood  on  these  line  segments  separately, 
using  the  NR  method,  and  they  check  which  is  the  GMLE.  This  approach  works  in  this  example.  However,  if 
there  are  m  p^’s,  we  need  to  consider  the  subsets  of  the  boundary  corresponding  to  one  pi  —  0,  two  p*  =  0,  ..., 
m  —  2  Pi  =  0.  Thus  the  order  is  0(m’”/^).  When  m  is  large,  this  approach  is  not  feasible. 

§3.  We  now  illustrate  why  the  new  algorithm  works.  Our  new  algorithm  is  as  follows. 

The  initial  step.  Let  the  GMLE  of  {pi,p2^pz)  be  the  initial  value  of  the  (pi,P2,P3)  and  a  =  l  the  initial  value 

of  a. 

/3-step.  Maximize  L  over  P  with  up-dated pi^s, 

S-step.  Each  S-step  consists  of  3  substeps:  pi-substep,  p^-substep,  pz-substep. 

pi-substep.  Consider  a  transformation  pii(«)  =  2^,  Piz{u)  =  and  pi3(u)  =  u  >  0.  This 
transformation  ensures  that  (piiiu),pi2{u),piz{u))  remains  in  the  parameter  space  of  (pi,p2,pz)  for  each  u>  0. 
Let  Uo  be  the  value  of  u  that  maximizes  L{P,pii{u),pi2{u),piz{u))  over  u  >  0,  with  ^  and  pi’s  given  in  the 
previous  step.  Then  up-date  pi  by  pi  =  pu{uo),  i  =  1,  2,  S. 
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P2-substep.  Consider  another  transformation  P2i{u)  =  P22{y)  =  P2z{y)  =  // 

^lnL(^,P2i(w),j>23(w))|^^Q  >  0,  choose  auo  >  0  that  maximizes  L{0^p2i{u)^p22{u),p23{y))  over  u>0,  vnth  0 
and  Pi  ^s  given  in  the  previous  step.  Up-date  Pi  by  pi  =  P2i(«o);  i  =  1,  S,  5. 

ps-substep.  Consider  a  farther  new  transformation  p3i{u)  =  p32{u)  =  and  pz3{u)  =  If 

^lnL(^,P3i(w),P33(w))|^_Q  >  0,  choose  a  Uo  >  0  that  maximizes  ^(i8,P3i(w),P32MjP33(w))  over  u  >  0,  with  0 
and  Pi 's  given  in  the  previous  step.  Up-date  Pi  by  pi  =  Psi{uo),  i  =  1,  2,  3. 

At  the  pi-substep  of  the  initial  iteration  step,  by  Eq.  (A.5),  ^lnL(^,Pii(w),Pi3(n))|^^Q  =  0.51  >  0  at 
(puPz)  =  (0.6, 0.4),  and  Uo  «  0.1  maximizes  L{0^p2i{u)^p2z{y))*  At  this  step  {puP2>tPz)  is  up-dated  to  (f^,  0, 

(=  (0.636,0.364)).  At  the p2-substep  and ps-substep,  by  Equations  (A.6)  and  (A.7),  ^lnL(^,Pii(w),p<3(it))|^^Q  < 
0,  z  =  2,  3,  thus  no  change  is  made.  However,  since  ipuPz^Pz)  is  changed  at  this  S-step,  0  (or  a)  will  also  be 
change  at  the  next  )9-step. 

In  fact,  in  the  next  ^-step,  0  is  up-dated  to  lnO.69  =  —0.371.  In  the  pi-substep,  ^ln2/(;8,pii(w),pi3(w))|^^Q 
=  0.14  >  0,  L  is  maximized  by  [puPz)  =  (f^^j  0I2)  (0.65,0.35)  with  Uo  =  0.042.  by  Equations  (A.6)  and 

(A.7),  ^lnL()9,p4i(u),pj3(w))|^^Q  <  0,  z  =  2,  3,  thus  no  change  is  made.  However,  since  (pi,P2>P3)  is  changed 
at  this  S-step,  0  (or  a)  will  also  be  change  at  the  next  ^-step. 

Iteratively  repeat  these  two  steps,  the  algorithm  will  yield  the  GMLE  {0,Pi,P2,Pz)  =  (“0.461,2/3,0, 1/3). 
Remark  2.  Recall  that  p^^^  is  the  GMLE  of  pi  when  /3  =  0.  Let  pi  be  the  GMLE  under  Cox’s  model.  According 

to  our  observation,  it  is  often  the  case  that  if  p^^^  =  0  then  p^  =  0  too.  It  is  not  clear  that  whether  it  is  indeed 
true  that 

p|®^  =  0  iff  Pi  =  0. 

If  this  is  true,  then  one  can  delete  the  Pi’s  for  which  pj^^  =  0  in  the  algorithm  to  reduce  the  dimension  of  the 
parameter  space.  Moreover,  after  this  elimination,  the  NR  method  will  work  too,  since  the  GMLE  is  in  the 
interior  of  the  parameter  space.  However,  both  the  sufficient  and  the  necessary  conditions  may  not  hold. 

§4.  The  following  is  the  details  of  deriving  the  GMLE  directly.  There  are  only  3  variables  in  L,  by  direct 
examination,  one  can  find  that  the  maximum  value  of  L  is  outside  the  parameter  space  and  the  GMLE  of 
(puP2,Pz)  is  on  the  boundary  of  the  parameter  space.  Moreover,  the  GMLE  of  (pi,P2,P3)  is  on  the  subspace 
P2  =  0,  as  L  =  0  if  Pi  =  0  or  p3  =  0.  If  P2  =  0, 1  =  (1-P3)^pj‘'’“(l  -p?)-  £  =  logps  -  =  logps^E^  =  0 

implies  that  unless  ps  =  1,  we  have  P3  =  1/2  or  p3  =  2““  or 


a  =  -log2/logp3. 


(A4) 


For  each  fixed  p3,  if  a  =  — log2/logp3,  L  achieves  its  maximum  (1  —  P3)^P3”^^^^'^^^®^^(1  —  p3)”^o82/iogp3 
GMLE  can  be  found  by  plotting  the  graph  of  (p3,L). 

§5.  In  this  section,  we  shall  derive  the  partial  derivatives  needed  in  §3. 


d  1-pi  d  .  .  -P2  d  ,  .  -P3 

(n-u)2’  (i  +  «)2’  (i  +  u)2- 


du 

d 


^lnL(^,pii(«),pi3(w))|^^o  = 

d 


^  2(1  -  Pi)  apt 
Pi  1-P? 


—  a  —  1 


^  ( \  Pi 

d 


f  1  1-P2  9  .  -P3 

P22(u)  =  7T~7..\2»  ^P23(u)  = 


(i+«)2’  du* 


(l  +  n)2 


Q^^W,P2l{u),P23iu))\^^0  =  -2  -  1 
du 


Pi 


Pi 

l-pz 


d  .  .  -pi  d  ,  .  -P2  d  ,  . 

P3l(«)  ^  -  (1  +  „)2  >  -  (l  +  u)2* 


AlnL(^.P3i(n),p33(n))|„^,  =  -2  +  +  a 


(A5) 


(A.6) 


(A.7) 
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Interval-censored  data  (IC)  arise  naturally  in  breast  cancer  follow-up  studies  in  which  the 
exact  value  of  a  time-to-event  variable  X  cannot  be  observed  but  is  known  to  lie  in  a  time 
interval  usually  defined  by  two  successive  follow-up  time  points.  Examples  of  such  an  X 
variable  include  time  to  relapse  and  time  for  the  value  of  a  biomarker  to  reach  a  target  value 
in  a  breast  cancer  prevention  trial. 

This  research  proposal  is  concerned  with  survival  analysis  of  X  in  the  presence  of  p 
covariates  denoted  by  the  vector  Z.  Let  S(x|z)=Pr(X>x|Z=z)  denote  the  survival  function  of 

X  at  Z=z.  Cox  regression  model  for  S(x|z)  is  given  by  S(x|z)=[So(x)]  ^  ,  where  So  is  a 
baseline  survival  function,  and  □  is  a  pxl  vector  of  regression  coefficients.  Generalized 
maximum  likelihood  estimates  (GMLE)  of  □  and  So  have  to  be  obtained  iteratively.  The 
usual  Newton-Raphson  (NR)  algorithm  will  most  of  the  time  not  work  owing  to 
dimensional  constraint  and  an  inherent  tendency  towards  convergence  to  local  maxima  in 
the  IC  situation.  We  propose  a  two-step  NP  algorithm  to  overcome  these  problems.  At  the 
first  step,  □  is  updated  with  the  usual  NR  algorithm.  This  step  should  present  little 
computational  difficulty  because  the  dimension  p  is  usually  manageable.  Let  xi  <  A  <  Xm 
denote  the  right-end  points  of  the  innermost  intervals  for  the  IC  data.  At  the  second  step  of 
our  propose  algorithm,  the  S(xi)  are  updated  successively  using  univariate  NR  algorithm 
one  component  at  a  time,  and  the  monotonicity  constraint  S(xi)  >  A  >  S(xn,)  is  maintained 
at  every  move. 

When  m  is  large,  we  propose  to  partition  the  data  into  groups  to  reduce  computational 
burden,  Monte  Carlo  simulations  indicate  that  the  GMLE's  are  quite  robust  against  partition 
size. 

We  have  applied  the  two-step  NR  algorithm  successfully  to  a  breast  cancer  follow-up  study 
involving  375  patients  and  with  m=21.  The  purpose  of  the  study  is  to  assess  the  prognostic 
significance  of  bone  marrow  micrometastasis  in  predicting  relapse  and  survival. 

Our  two-step  NR  algorithm  provides  for  the  first  time  a  feasible  computational  algorithm 
that  will  enable  researchers  to  perform  Cox  regression  analysis  of  IC  data  from  breast 
cancer  follow-up  studies  of  reasonable  dimensions. 


The  U.S.  Army  Medical  Research  and  Materiel  Command  under  DAMD 17-00- 1-0448  supported  this  work. 
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Abstract:  The  logrank  test  for  the  two-sample  problem  is  very  popular  in  medical  research  and  is  most 
powerful  tmder  Cox’s  proportional  hazards  model.  There  is  a  diagnostic  plot  for  visualize  whether  data  fit 
Cox’s  model,  but  there  is  no  test  available  for  testing  whether  a  continuous  and  censored  data  set  indeed 
fits  the  model.  We  propose  such  a  test.  We  demonstrate  by  cancer  research  data  that  if  om  test  suggests 
that  Cox’s  model  is  appropriate,  both  the  logrank  test  and  the  test  based  on  the  weighted  Kaplan-Meier 
statistic  give  consistent  results.  If  our  test  suggests  that  Cox’s  model  is  not  appropriate,  the  logrank  test 
suggests  that  the  difference  between  two  survival  functions  is  not  significant,  while  test  based  on  the  weighted 
Kaplan-Meier  statistic  indicates  that  the  difference  is  significant. 

1.  Introduction. 

Cox’s  proportional  hazards  model  has  often  been  used  in  comparison  of  two  survival  functions.  However, 
there  is  no  discussion  in  the  literature  on  how  to  test  whether  the  model  is  appropriate  for  a  continous  and 
censored  data  set.  We  shall  address  this  issue  in  this  paper. 

In  medical  research,  an  important  question  that  is  asked  frequently  is  whether  one  group  of  patients  has 
a  higher  survival  rate  than  another  group  of  patients,  or  whether  a  new  treatment  is  better  than  another 
treatment.  This  is  called  a  two-sample  problem.  Let  Xij,  j  =  1,  ...,  rii,  be  survival  times  of  patients  in 
Group  i,  i  —  1,2.  Xij  has  an  unknown  siuvival  function  Si.  The  null  hypothesis  for  two-sample  problems  is 

Ho:  Si  -  52  {Si{x)  =  S2{x)  for  all  x), 


against  the  alternative  hypothesis 

Hi:  Si  >  S2  {Si{x)  >  S2{x)  for  all  x  yet  5i(x)  >  52 (a:)  for  some  a:). 

Another  alternative  is 

H2:  5i  /  52  (  5i(x)  ^  52(a:)  for  some  a:). 

It  is  often  that  the  value  of  Xij  is  not  exactly  observed,  but  is  only  known  to  lie  within  two  time 
points,  say  Lij  and  Rij.  For  example,  relapse  time  of  a  cancer  patient  is  only  known  to  occur  between  two 
consecutive  follow-up  times,  or  is  right  censored  if  relapse  has  not  taken  place  by  the  last  follow-up  time. 

^  Partially  supported  by  Army  Grant  DAMD 17-00- 1-0448. 
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In  the  latter  case,  one  can  denote  Lij  the  last  foUow-up  time  and  Rij  =  oo.  We  say  such  observations  are 
interval  censored.  In  this  paper,  we  assume  that  Xij  may  subject  to  interval  censoring.  Thus,  one  observes 

Xtj  G  [L^j,Rij]  if  Ltj  Rij,  right  censoring,  the  observations  are  equivalent  to 


Xij  G  {Lij,  Rij]  if  Lij  <  Ri^ 


■j* 


{Lij,  Rij),  where 

{Mij,Sij),  where  Mij  =  Lij  and  Sij  is  the  indicator  function  of  the  event  Rij  =  oo. 

By  letting  Z  be  the  indicator  function  of  the  event  that  the  observation  is  from  Group  1,  one  can  use 
regression  models  to  test  the  two-sample  problem.  In  particular,  if  data  satisfy  Cox’s  proportional  hazards 
model,  namely,  S{t\Z)  =  {So{t))^  ,  where  So  is  a  baseline  survival  function,  the  null  hopthesis  Hq  becomes 

HqI  P  =  0.  Note  that  the  test  becomes  the  logrank  test  when  data  are  right  censored.  The  logrank  test  is 
most  powerful  imder  the  Cox  regression  model  (see  Miller  (1981)). 

In  order  to  utilize  the  Cox  regression  model,  it  is  important  to  test  whether  the  Cox  regression  model 
is  appropriate  for  a  given  data  set.  Since 


log(-log(S'i(a;)))  =  +log(-log(5o(x)) 


and 

log(-log(S2(x)))  =  log(-log(S'o(x)), 

it  is  suggested  in  several  textbooks  (see  Cox  (1994)  and  Lee  (1992))  to  use  the  log-log  siurvival  plot  to  see 
whether  the  Cox  regression  model  is  appropriate.  In  particular,  letting  Si  be  the  generalized  mayiimiTn 
likelihood  estimator  (GMLE)  of  the  survival  function  based  on  observations  from  Group  i  (i  =  1,  2),  it  is 
suggested  to  plot  the  two  curves 

y  =  log(-log(i§i(x)))  and  y  =  log(-log(52(x))) 

on  the  same  graph.  If  the  two  curves  are  roughly  parallel,  then  the  Cox  model  is  appropriate.  However,  there 
is  no  test  available  for  testing  whether  the  two  curves  are  parallel.  We  present  two  examples  (see  Examples 
3.1  and  3.4)  that  even  though  the  data  does  not  fit  Cox’s  model,  it  is  difficult  to  judge  whether  log-log  plots 
“appear”  parallel. 

Cox  (1984,  p.l50)  suggests  a  testing  procedure  for  discrete  data,  with  the  test  statistic  H  =  h2{t)/hi{t), 
where  hi  is  an  estimate  of  the  discrete  hazard  of  Sample  i.  However,  if  the  data  are  continuous,  e.g,,  there 
is  no  tie  among  data,  this  approach  is  not  appropriate. 

In  Section  2,  we  propose  an  improved  diagnostic  plotting  procedure  and  a  test  on  whether  the  two  curves 
are  parallel  with  right-censored  data  or  interval-censored  data.  In  Section  3,  we  apply  the  new  procedures 
to  several  data  sets.  We  demonstrate  the  following  interesting  facts  by  two  cancer  research  data  examples. 

(1)  The  log-log  plotting  cannot  tell  that  the  data  do  not  fit  Cox’s  regression  model. 

(2)  Even  though  it  seems  quite  obvious  that  the  two  survival  functions  are  different,  the  logrank  test  gives 
insignificant  results  in  both  cases. 

(3)  The  test  based  on  the  weighted  Kaplan-Meier  statistic  (see  Pepe  and  Fleming  (1991))  gives  significant 
results  in  both  cases. 

(4)  Our  new  procedure  indicates  that  both  data  do  not  fit  Cox’s  regression  model. 

We  demonstrate  by  another  cancer  research  data  set  that,  when  our  procedure  find  no  evidence  that  Cox’s 
model  is  not  appropriate  for  that  data,  both  the  logrank  test  and  the  test  based  on  the  weighted  Kaplan- 
Meier  statistic  give  significant  results.  We  further  apply  our  new  procedure  to  a  simulation  data  set  from 
Cox’s  regression  model,  and  the  result  is  not  significant,  as  it  should  be. 

2.  Method 

We  shall  propose  a  test  for  Cox’s  model  is  appropriate. 


2.1,  With  right-censored  data 

Under  right  censoring,  the  observations  are  {Mij,  Sij),  j  =  1,  ...,  Ui,  j  =  1,  2.  The  GMLE  of  a  sxnvival 
function  based  on  Sample  i  is 


Siit)=  11(1- 


hi)  N 
rii-j  +  V’ 


2 


where  <  •  ■  •  <  order  statistics  of  Mij  from  Sample  i,  and  is  the  Sik  that  assosiated  with 

By  the  definition  of  Z  and  the  property  of  the  Cox  model, 


log[— log  51(^1  Z)]  —  log['-log52(t|Z)]  =  /?  for  each  t  and  Z,  (2.1) 

In  view  of  this  equality,  we  propose  to  replace  the  log-log  plot  by  plotting 

U  =  U{t)  =  log(-  log(5i(t)))  -  log(-  log(52(t))).  (2.2) 

In  view  of  (2.1),  we  shall  inspect  whether  the  curve  is  some  what  a  horizontal  straight  line,  that  is,  it  is 
within  a  band.  Furhtermore,  in  order  to  test  define  a  set  of  statistics  as  follows.  Let  61,  ...,  6^+1  be  all 
the  distinct  exact  observations  from  the  pooled  sample.  Compute 

Uj  =\og[-\ogSi{bj)]-log[-logS2ibj)],  j  =  l,...,m  +  l.  (2.3) 


Thus,  one  expects  that  Ui,  ...,  i/m+i  are  statistically  a  constant.  Note  that  if  Si{bj)  =  0  or  1,  f/j  =  ±00. 
By  deleting  these  types  of  6^,  without  loss  of  generality,  we  can  assume  that  Ui's  are  all  finite.  Denote 
Qi  =  Ui^i  -Ui,i  =  1,  ...,  m  For  simplicity,  we  further  assume  that  m  is  even  (by  deleting  one  number).  Let 
Q  =  (Qij  Then  Qi  has  mean  zero.  Note  that  the  covariance  matrix  Si  of 

(5i(ti),...,5i(t  mi  mi+l)}  •••?  ^2(^mi+m2)) 

are  known,  where  ti,  ...,*^1  are  exact  observations  in  the  first  group  and  tmi+ij  are  exact  obser¬ 

vations  in  the  second  group.  For  convenience,  denote 


®*-{| 


Si{ti)  if  i  = 

S2{ti)  it  i  =  +m2. 


Note  that  Si(bj)  =  Si  if  <<  <  bj  <  ti+i  and  i  <  mi,  or  if  <<  <  bj  and  i  =  mi;  and  S2{bj)  =  Sf  if  fj  <  bj  <  tf+i 
and  i  <  mi  +m2,  or  if  STOi+m2  <  bj  and  i  =  mi  +  m2.  By  Slutsky’s  theorem,  one  can  compute  the  covariance 
matrix  of  Q,  say,  S.  In  fact,  E  =  ETSiH',  where  H  =  ^  GMLE  of  S.  Denote 

W  =  (W^i,...,W,„)'  =  E-1/2Q.  If  n  is  suflScient  large,  say  n  >  20,  by  the  asymptotic  normality  of 
without  loss  of  generality,  one  can  assume 

A1  Wi,  ...,  Wm  are  a  random  sample  from  a  normal  distribution. 

Let 


T,T>m/2(Wi-U2r 


2  2 

where  wi  =  —  IFi  and  U2  =  —  Wi, 
»=1  i>ml2 


(2.4) 


then  we  expect  that  F  has  an  F  distribution  with  (m/2)  -  1  and  (m/2)  -  1  degrees  of  freedom.  A  test  is  to 
reject  if  F  is  too  large  or  too  small  {V  should  be  around  1  under  H^). 


2.2.  With  interval-censored  data 

We  first  discuss  how  to  obtain  a  GMLE  of  5*.  An  interval-censored  observation  {Li,Ri)  can  be  repre¬ 


sented  by  an  interval  li 


r  ^{{LuRi] 


if  L*  <  '  A  nonempty  intersection  of  some  Ji’s,  say  A,  is  called  an 

innermost  interval  if  A  satisfies  that,  for  each  i,  An  li  equals  0  or  A.  Turnbull  (1974)  shows  that  the  GMLE 
based  on  random  intervals  Ji,  i  =  1,  ...,  n,  only  puts  weights  on  innermost  intervals.  Let  Ai,  ...,  Am  be  all 
the  distinct  innermost  intervals  induced  by  Ji,  ...,  /„.  Let  Sj  be  the  weight  assigned  by  the  GMLE  to  Aj 
and  s  =  (si, ...,  s^).  An  GMLE  of  a  cdf  is 


m 

F(t)  =  %l(AjC(-oo,t])5  where  Ib  is  the  indicator  function  of  an  event  B,  (2.4) 

i=i 


3 


The  GMLE  s  can  be  obtained  by  the  following  self-consistent  algorithm: 


(1) 

(2) 


Let  =  1/m,  j  =  1,  m. 


„(fc) 


For  jfe  >  1,  let  =  -  V  — j  =  i 


m. 


Repeat  Step  2  until  convergence.  The  limit  limfc_yco(5i*^\  — ,Sm^)  is  s. 

For  more  efficient  algorithms  for  the  GMLE,  we  refer  to  Wellner  and  Zhan  (1997).  By  numerical 
methods,  we  can  compute  the  GMLE’s  E,  Fi  and  F2,  and  thus  can  compute  A  (see  (2.1)). 

It  is  well  known  that  for  each  i,  given  interval-censored  observations  (Ltj)  Rij)%  there  is  a  unique  GMLE 
of  Si  such  that  it  is  a  right-continuous  step  function  with  discontinuity  points  only  at  i?ij’s.  Let  Si  be  such 
a  GMLE  of  Si,  t  =  1,  2,  and  let  61,  bm  be  all  the  distinct  finite  points  at  which  5i  or  S2  takes  a  jump. 
We  propose  to  replace  log-log  plot  by  plotting  U  =  U{t)  defined  in  (2.2),  with  new  6i’s  and  new  Si%  and 
reject  if  V  is  too  large  or  too  small,  where  the  test  statistic  V  is  defined  in  (2.4)  with  new  ft^’s  and  new 
5i’s. 


3.  Application 

In  this  section,  we  apply  our  new  procedure  to  three  cancer  research  data  sets  and  one  simulation  data 
set.  Three  are  right-censored  data  and  one  is  interval-censored  data.  We  demonstrate  by  two  cancer  research 
data  examples  (Example  1  and  3)  the  following  interesting  facts: 

(1)  The  log-log  plotting  cannot  tell  that  the  data  do  not  fit  Cox’s  regression  model. 

(2)  Even  though  it  seems  quite  obvious  from  survival  plots  that  the  two  survival  functions  are  diiEerent  and 
The  test  based  on  the  weighted  Kaplan-Meier  statistic  (see  Pepe  and  Fleming  (1991))  suggests  that  the 
difference  is  significant  in  both  cases,  the  logrank  test  suggests  there  is  no  difference  in  both  cases. 

(3)  Our  new  procedme  indicates  that  both  data  do  not  fit  Cox’s  regression  model. 

We  demonstrate  by  Example  2  that  when  our  procedure  find  no  evidence  that  Cox’s  model  is  not 
appropriate  for  that  data,  both  the  logrank  test  and  the  test  based  on  the  weighted  Kaplan-Meier  statistic 
give  significant  results.  Finally,  we  generated  two  samples  of  data  from  Cox’s  regression  model,  our  new  test 
reveals  that  there  is  no  evidence  that  the  data  are  not  from  Cox’s  model,  as  we  expect. 

Example  1.  (the  AML  Maintenance  Study)  (see  Miller  (1980)). 

Fig.  1 .  Survival  PLot  for  Leukemia  Data 
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A  clinical  trial  to  evaluate  the  efficacy  of  maintenance  chemotherapy  for  acute  myelogenous  leukemia 
(AML)  Wets  conducted.  There  are  11  right-censored  data  in  the  maintained  group:  9,  13,  13+,  18,  23,  28+, 
31,  34,  45+,  48,  161+.  There  12  data  in  the  control  group:  5,  5,  8,  8,  12,  16+,  23,  27,  30,  33,  43,  45.  The 
GMLE’s  of  the  survival  functions  of  these  two  groups  are  plotted  in  Figure  1.  One  may  want  to  test  whether 
the  two  distribution  are  the  same.  In  Figure  2,  we  plot  log(— log(5i(t))).  It  seems  that  the  two  curves  are 
parallel.  The  logrank  test  gives  a  P-value  0.0655.  That  is,  the  two  distribution  functions  are  not  significant 
different. 


Fig.  2.  log-log  Survival  PLot  for  Leukemia  Data 


20  30 

time  t  in  weeks 


Pepe  and  Fleming  (1991)  propose  a  test  based  on  the  statistic  [iSi(t)  —  S2(^)]dt,  where  T  is  the  longest 
foUow-up  time.  For  this  data  set,  it  gives  a  P-value  <0.01.  Thus  the  test  suggests  that  the  two  distributions 
are  significantly  different,  which  is  consistent  with  Figure  1. 


We  plot  the  curve  log(— log(iSi(t)))  —  log(— log(52(t)))  in  Figure  3.  The  curve  in  Figure  3  does  not 
appear  to  be  a  band.  Here  Si{t)  is  the  survival  function  of  the  control  group  and  52 (t)  is  the  survival 
function  of  the  maintenances  group.  We  further  compute  statistics  C/j’s  and  V.  For  the  current  data  set, 
Ui^s  equal  oo,  oo,  1.4467262,  1.7316298,  0.9870430,  0.4784492,  0.3912877,  0.6607606,  0.9265938,  0.5493081, 
0.8340220,  0.4937044,  0.8466185,  oo,  oo. 


Note  that  for  this  data  set,  m  =  15,  =  oo  at  i  =  1  or  2  and  I/i  =  0  at  i  =  14  or  15.  Thus  we  delete 

these  four  points.  By  further  deleting  the  middle  point  (so  that  m  =  10),  we  fotmd  that  V  is  extremely  large 
(=  9.1),  with  a  P-value  <  0.025.  If  we  delete  the  13rd  point,  V  =  10.1,  the  P-value  is  also  <  0.025.  Thus  we 
concluded  that  it  is  unlikely  that  the  data  fit  the  Cox  model.  Thus  it  is  not  a  surprise  that  the  logrank  test 
does  not  reject  Hq. 
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Fig.  3.  Diagnostic  PLot  for  Leukemia  Data 
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Example  2.  Survival  data  of  30  patients  with  AML  are  given  in  Lee  (1992,p.257).  Among  them  17  patients 
are  old  (>  50  years)  with  survival  times:  6,  7,  8,  9,  15,  18,  19-I-,  23,  28+,  28+,  31,  39+,  45+.  The  survival 
times  of  the  younger  group  are  2,  3,  3,  3,  4,  4,  8,  8,  9,  10,  12,  13,  13,  14,  18,  26+,  354-.  The  two  GMLE’s  of 
the  survival  functions  are  plotted  in  Figure  4, 

In  Figure  5,  log(—  iog(<Si))  are  plotted,  Lee  claims  that  Figure  5  suggests  that  the  data  fit  the  Cox 
regression  model.  We  use  om  approach  to  justify  that  claim.  Here  5i  (t)  is  the  survival  function  of  the  older 
group  and  S2{t)  is  the  survival  function  of  the  younger  group. 


Fig.  4.  Survival  PLot  for  AML  Data 
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Fig.  5.  log-log  Survival  Plot  for  AML  Data 


We  plot  the  curve  log(—  log(5i(t)))  —  log(—  log(52(i)))  in  Figure  6.  The  curve  in  Figure  6  does  appear 
to  be  a  band. 

V  =  1.00419  with  5  and  5  degrees  of  freedom,  with  a  P-value  >  0.1.  Thus  there  is  no  evidence  that  the 
Cox  Model  is  not  appropriate. 

The  logrank  test  has  a  P-value  0.0249.  This  is  an  example  that  if  the  data  fit  the  Cox  model  then  the 
logrank  test  is  very  powerful. 

Fig.  6.  Diagnostic  PLot  for  AML  Data 


Example  3.  We  generate  two  samples  of  right-censored  data  from  exponentied  distributions,  with  density 
functions  f(x  :  6)  =  x  >  0,  $  =  1,  2.  The  first  sample  has  20  data: 

0.0115,  0.1240,  0.1660,  0.2830+,  0.3286,  0.3603,  0.4047,  0.4586,  0.4599,  0.4762, 

0.8303,  0.8379+,  0.8647,  1.0008,  1.0872+,  1.1345,  1.1740,  1.2917+,  1.6129,  2.7834+, 

The  second  sample  also  has  20  data: 
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0.0089,  0.0622,  0.0722,  0.1307,  0.1345,  0.1751,  0.2390,  0.2707,  0.4198,  0.4489, 
0.4564,  0.5426,  0.6066,  0.7225+,  0.8942,  0.9409,  0.9886,  1.4954,  1.9492,  2.4567. 


The  test  statistic  V  has  a  P-value  >  0.05.  Thus  we  cannot  reject  the  null  hypothesis  which  is  a 
correct  decision  in  this  simulation  study. 


Example  4.  We  applied  the  new  procedure  to  a  standard  breast  cancer  relapse  follow-up  study  based  on 
data  from  375  women  with  stages  I  -  III  unilateral  invasive  breast  cancer  surgically  treated  at  Memorial 
Sloan-Kettering  Cancer  Center  between  1985  and  1990.  The  median  follow-up  duration  was  46  months. 
Relapse  time  was  given  by  the  time  interval  between  surgery  and  the  initial  relapse.  A  relapse  that  took 
place  between  two  successive  follow-up  visits  was  regarded  as  interval  censored.  If  a  patient  did  not  relapse 
toward  the  end  of  the  study,  then  her  relapse  time  was  right  censored.  Of  the  375  observations,  300  were 
right  censored  (no  relapse),  21  were  left  censored  and  54  were  strictly  interval  censored.  Bone  marrow 
micrometastasis  (BMM)  was  determined  for  each  woman  at  the  time  of  surgery.  An  important  question  is 
whether  remission  duration  is  related  to  the  extent  of  initial  tumor  cell  burden  (TCB)  defined  as  number  of 
BMM  cells  detected.  We  grouped  the  patient  according  to  whether  the  patient’s  number  of  BMM  cells  is 
<  14  or  >  14. 
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Figure  7.  Tumor  Cell  Burden 
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The  GMLE’s  of  the  survival  functions  of  these  two  groups  are  plotted  in  Figmre  7.  It  seems  from  the 
figure  that  the  two  survival  functions  are  different.  We  also  given  the  log-log  plot  in  Figure  8.  It  is  hard 
to  say  from  the  figure  that  the  two  curves  axe  not  parallel.  Thus  we  let  Z  be  the  indicator  function  of  the 
event  that  the  observation  is  from  Population  2,  i.e.,  the  number  of  BMM  cells  is  >  14,  and  assume  that  the 
data  fit  Cox’s  regression  model.  It  turns  out  that  the  semi-parametric  MLE  of  j3  is  0.328  with  a  standard 
deviation  0.293.  However,  the  P-value  is  0.131.  That  is,  P  is  not  significantly  different  from  0.  The  Cox 
model  approach  says  that  the  BMM  has  no  effect  on  the  survival  rate.  The  conclusion  is  not  consistent  with 
Figure  1. 


8 


Figure  8.  Log-Log  Plot  For  Tumor  Cell  Burden 
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We  compute  the  test  based  on  the  statistic  /W’(t)(Fi(t)  —  F2{t))dt,  where  W{t)  =  1(o,t)  and  T  is  a 
constant.  The  P-value  is  0.029.  In  our  calculation,  we  chose  T  =  3500,  the  longest  follow-up  time  of  a 
patient.  The  result  indicates  that  the  effect  of  BMM  is  significant.  That  is,  Hq  is  not  true. 

Applying  the  new  approach  to  the  data,  we  found  that  V  is  extremely  large,  with  a  P-value  <  0.01. 
Thus  we  concluded  that  it  is  unlikely  that  the  data  fit  the  Cox  model.  Thus  it  is  not  surprised  why  the  Cox 
regression  analysis  performs  poorly. 

We  further  check  whether  the  grouping  in  BMM  and  BMM-  would  fit  the  PH  model.  The  P-value  is 
0.013  and  the  diagonostic  plots  are  give  in  Figure  9. 
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ABSTRACT 

Consider  the  model  Y  =  pX  -f-  e  with  interval-censored  data,  where  e  has  an  unknown  cdf  Fq.  The  semi- 
parametric  MLE  (SMLE)  of  is  well  defined,  but  cannot  be  obtained  by  algorithms  for  M-estimators,  or 
by  the  Newton-Raphson  method  or  the  Monte-Carlo  method.  Thus  it  has  not  been  studied  in  the  literature 
even  in  the  case  of  complete  data.  We  propose  a  feasible  algorithm  to  obtain  all  solutions  of  the  SMLE. 
Simulation  suggests  that  the  SMLE  is  consistent  and  the  bootstrap  estimator  of  the  variance  of  the  SMLE 
matches  the  sample  variance.  We  compare  the  SMLE  to  the  Buckley-James  estimator  (BJE)  in  four  data 
sets  with  sample  sizes  up  to  374.  The  results  show  that  the  SMLE  is  more  robust  and  more  reliable  than 
the  BJE. 


1.  INTRODUCTION 

Regression  analysis  is  one  of  the  most  widely  used  statistical  techniques.  Its  applications  occur  in  almost 
every  field,  including  economics,  engineering,  the  physical  sciences,  management,  life  and  biological  sciences 
and  the  social  sciences.  We  consider  the  simple  linear  regression  problem  with  interval-censored  data.  In 
particular,  we  assume  that 

Al.  Y  =  0X  +  e,  where  Y  is  a  random  variable,  X  a  covariate  taking  at  least  two  values,  /?  an  (unknown) 
regression  coefficient  and  e  a  random  variable  with  an  unknown  c.d.f.  Fq. 

Since  no  further  assumption  is  made  on  {P^Fo),  this  is  a  semi-parametric  problem. 

There  are  several  estimators  for  the  simple  linear  regression  model  with  complete  data.  They  are  the 
least  squares  estimator  (LSE),  Theil’s  estimator  (Theil  (1)),  various  M-estimators  (Huber  (2)),  adaptive 
estimators  (Bickel  (3)),  and  Bayes  estimators. 

Sometimes  Y  may  be  right  censored.  There  have  been  extensive  studies  on  hnear  regression  models 
with  right-censored  data,  see,  e.^.,  Buckley  and  James  (4),  Miller  (5)  and  Ritov  (6),  among  others. 

*  Partially  supported  by  DAMD 17-99-1-9390  and  DAMD 17-00- 1-0448. 
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Interval  censoring  is  another  type  of  censoring.  For  example,  in  medical  follow-up  studies,  a  patient  may 
be  inspected  (or  interviewed)  total  of  K  times  {K  >  1),  at  Ck,i  <  •  •  •  <  Gk,k»  Time  to  event  of  interest,  y, 
is  often  not  observable  but  is  only  known  to  lie  within  two  consecutive  inspection  times  Ckj  and  Ckj-\-u 
where  Ck,o  =  “Oo,  Ck,k+i  =  oo,  and  I  <  j  <  K.  In  such  a  case  we  are  only  able  to  observe  (L,  B,  X), 
where  — oo  <  L  <Y  <  R<  +oo.  We  say  such  an  observation  is  interval  censored.  Hereafter,  we  assume  that 
A2.  the  observable  random  vector  is  (L,iJ,X),  where  (L,  i?)  is  an  extended  random  vector  (e.^.,  R  =  oo 

imder  right  censoring)  such  that  either  L  <Y  <RoxL  —  Y  =  R, 

Odell,  et  at  ((7))  considered  the  parametric  maximum  likelihood  estimation  for  interval  -censored 
data  based  on  a  WeibuU  distribution.  Self  and  Grossman  (8)  considered  the  linear  regression  problem  with 
interval-censored  data  for  a  given  distribution  Fo  with  location-scale  parameter  and  proposed  a  marginal 
likelihood  approach. 

Several  authors  have  proposed  estimators  for  under  hnear  regression  models  with  interval-censored 
data,  assuming  Fo  is  unknown.  Rabinowitz,  et  at  (9)  proposed  a  class  of  score  statistics  to  estimate  /?.  Their 
approach  parallels  the  construction  of  the  Buckley  &  James  estimator  (BJE)  for  right-censored  data.  Li  and 
Pu  (10)  considered  generalizations  of  Miller’s  estimator  and  the  BJE  for  interval-censored  data  that  contain 
exact  observations.  Zhang  and  Li  (11)  and  Li  and  Zhang  (12),  among  others,  studied  M-estimators  with 
doubly-censored  data  and  Case  1  interval-censored  data.  These  approaches  can  be  viewed  as  a  modification 
of  the  Semi-parametric  MLE  (SMLE).  But  they  are  not  an  SMLE  (see  Example  1.1  below). 

Under  the  semi-parametric  set-up,  it  is  weU  known  that  the  LSE  is  not  efficient  unless  Fo  is  a  normal 
distribution.  The  BJE  is  an  extension  of  the  LSE  under  censoring  (see  Li  and  Zhang  (12)),  thus  the  BJE 
is  not  efficient.  Consequently,  since  the  BJE  is  also  an  M-estimator  (see  Zhang  and  Li  ((11),  p.  2723)),  an 
M-estimator  may  not  be  efficient.  Li  and  Zhang  proposed  efficient  M-estimators  for  Case  1  interval-censored 
data.  However,  it  is  not  clear  how  to  obtain  an  efficient  estimator  for  arbitrary  interval-censored  data. 

The  SMLE  of  (^,  Fo)  has  long  been  ignored  and  there  is  no  algorithm  in  the  literature  for  obtaining  the 
SMLE  with  interval-censored  data.  Recently,  Li  and  Zhang  (12)  mentioned  without  a  proof  that  the  SMLE 
(they  called  the  profile  MLE)  should  be  consistent  with  Case  1  interval-censored  data.  However,  there  is 
no  hint  on  how  to  compute  it.  Unlike  the  MLE  in  other  cases,  the  SMLE  cannot  be  computed  by  standard 
numerical  methods,  e.^.,  the  Monte  Carlo  method,  the  Newton-Raphson  method,  the  M-estimation  methods 
and  the  finite  algorithms  discussed  in  Osborne  (13).  For  the  sake  of  simplicity,  we  illustrate  with  4  artificial 
complete  observations  as  follows. 

Example  l.l.  Suppose  the  observations  (Xi,li)’s  are:  (--1,0),  (0,1),  ((1,2)  and  (4,0).  The  generalized 
likelihood  function  (Kiefer  and  Wolfowitz  (14))  is 

i>)  =  nti  /(^i  -  f{t)  =  F{t)  -  F{t-)  and  F  is  a  cdf.  (1.1) 

Then,  given  6,  L  is  maximized  by  the  empirical  cdf  Ft,,  where  A(t)  =  j  5Zi=i  and  is  the 

indicator  function  of  the  event  A,  That  is, 

r(|)4  if  6=1, 

L(F,  6)  <  L(F6,  6)  and  L(F6, 6)  =  I  (|)2(i)2  if  6  =  0,  -1/4  or  -2/3,  (1.2) 

[  (|)^  otherwise. 

Thus  the  SMLE  of  yS  is  1. 

A  key  requirement  in  the  Newton-Raphson  method  and  the  Monte  Carlo  method  (see,  c.^.,  Ingber  (15)), 
as  well  as  in  the  finite  algorithms  discussed  in  Osborne  (13),  is  that  L  is  continuous  at  a  neighborhood  of 
the  maximum  point  or  L  is  convex.  It  follows  from  (1.2)  that  l{b)  =  L(F6,6)  =  ^  a.e.,  which  is  not  the 
maximum  of  L(E,  6),  That  is,  Z(-)  is  neither  continuous  at  the  SMLE  nor  is  convex  in  b.  Consequently,  the 
two  standard  methods,  as  well  as  the  finite  algorithms  discussed  in  Osborne,  do  not  help  to  find  the  SMLE. 

The  M-estimate  considered  by  Zhang  and  Li  (12)  is  a  solution  to  where  F  is  properly 

defined.  Since  ^  =  0  a.e.  by  (1.2),  the  SMLE  is  somewhat  an  M-estimator.  But  this  M-estimation 
approach  is  non-informative,  as  every  value  of  6  is  an  M-estimate.  Theil’s  estimator  is  the  median  of  the 
collection  of  slopes  of  the  line  segments  connecting  (Xi,y<)  and  {Xj,Yj)^  where  1  <  i  <  j  <  n  (=  4).  Thus, 
Theil’s  estimator  is  not  an  SMLE.  It  is  easy  to  verify  that  the  LSE  is  not  an  SMLE  neither.  □ 
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Remark  1.1.  It  is  worth  mentioning  that  in  Example  1.1,  one  may  modify  the  likelihood  fimction  in  (1.1) 
as  follows: 

n 

L  =  n  fiYi  -  f>Xi),  /  =  F'  and  F  e  00,  (1.3) 

i=l 

where  ©o  is  a  subset  of  certain  absolutely  continuous  cdf ’s.  Then  the  values  of  (F,  b)  that  maximizes  L  over 
all  b  and  over  all  F  G  ©o  can  be  called  a  modified  SMLE  or  repaired  SMLE.  But  it  has  not  been  called  the 
SMLE  in  the  literatme.  This  modification  is  the  motivation  of  the  M-estimation  approach  and  the  approach 
based  on  score  fimctions. 

More  recently,  Yu  and  Wong  (16)  studied  the  SMLE  with  right-censored  data.  Simulation  suggests  that 
the  SMLE  $  is  consistent  and  lim„_,.oo  TiVar0)  =  0  if  Fo  is  discontinuous.  The  property  was  proved  under 
a  discrete  assumption  that  (X,  Y)  takes  on  finitely  many  values.  In  contrast,  all  the  existing  estimators  do 
not  have  the  second  property.  Under  continuous  assumptions,  the  SMLE  with  interval-censored  data  may 
attain  the  efficient  lower  bound  (see  Cosslett  (17)). 

In  this  paper,  we  shall  study  how  to  derive  an  SMLE  with  interval-censored  data.  In  Section  2,  we 
define  the  SMLE.  In  Section  3,  we  propose  feasible  algorithms  to  obtain  all  possible  SMLE’s.  In  Section  4, 
we  present  simulation  results.  The  simtilation  results  indicate  that  the  SMLE  is  computationally  feasible, 
is  consistent  and  its  standard  error  can  be  estimated  by  the  bootstrap  method.  In  Section  5,  we  apply  our 
procedure  to  four  data  sets  and  compare  to  the  LSE  or  the  BJE.  Several  comments  are  made  in  Section  6. 

2.  THE  SMLE 

Let  (Yj,  i  =  1,  ...,  n,  be  i.i.d.  copies  of  (Y,X,  e,  L,i?).  Denote  random  intervals 

T  T  _  f  (Fj  —  bXi^  Ri  —  bXi]  if  Li  <  iij, 

it  -  it  w  -  \  [Li  -  bXu  Ri  -  bXi]  if  Li  =  Ri, 

Note  that  L  is  a  singleton  if  Li  =  R^.  Denote  the  measure  induced  by  F  G  F,  where  F  is  the  class  of  all 
distribution  functions.  In  other  words, 

J  -  I  Fiv)  -  F(u-)  if  7  =  [w,  v]. 

By  assumptions  Al  and  A2,  Ii{l3)  are  Li.d.  random  intervals  and  Cj  G  7i(/3).  Thus  the  generalized 
likelihood  function  defined  by  Kiefer  and  Wolfowitz  (14)  is 

n 

L(F, 6)  =  J] MF(7i(6)),  FeT,  ben  (=  (-00, 00)).  (2.1) 

t=l 

A  semi-parametric  MLE  of  (F^,/?)  maximizes  L.  Given  6,  the  GMLE  of  Fo  based  on  observations  /t(6)’s 
maximizes  L(*,6)  over  F  £  T,  Thus,  in  order  to  find  the  SMLE  of  (Fo,/?),  it  suffices  to  maximize 

/(6)  =  L(F6,6),  6G7^.  (2.2) 

The  SMLE  of  ^  may  not  be  unique  (see  Example  3.1).  Denote  the  set  of  all  solutions  of  the  SMLE  of  ^  by 
B.  Then  each  Fj,  6  G  B,  is  an  SMLE  of  Fq. 

Under  assumption  Al,  a  =  E{e)  may  not  exist.  If  it  does  exist,  a  can  be  estimated  by  d  =  /  tFb{t), 
where  b  E  B.  However,  even  if  i»  =  ^8,  d  is  not  consistent  unless  Y  is  observable  (with  positive  probability) 
everywhere  on  its  range,  or  P(y  is  not  censored|y  =  t)  >  0  for  all  possible  t.  Thus,  in  general,  a  is  not 
identifiable  under  censoring  (see,  e.g,,  Buddey  and  James  (4)).  We  shall  ignore  d  in  our  study. 

3.  METHODS 

In  this  section,  we  shall  first  introduce  an  algorithm  which  guarantees  to  obtain  all  solutions  of  the 
SMLE  and  then  introduce  another  algorithm  which  is  faster,  but  offer  no  proof. 
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In  view  of  (2.2),  in  order  to  find  the  SMLE,  it  snflBces  to  compare  l{b)^  b  G  Let  Ui{b)  and  Vi{b)  be 
the  endpoints  of  the  interval  Ii{b),  z.e.,  Ui{b)  =  Li  —  bXi  and  Vi{b)  =  Ri  —  bXi.  Let  T2i-i{b)  =  Wi(6)  and 
T2i{b)  =  Vi{b)y  i  =  1,  n.  Then  we  can  determine  the  ranks  of  the  2n  extended  random  variables.  Given 
6,  it  is  well  known  that  L{Fb^b)  only  depends  on  the  ranks  of  the  2n  Ti(6)’s  (see  Turnbull  (18)).  That  is, 


l{bi)  =  11(62)  if  the  rank  of  Ti{bi)  is  the  same  as  the  rank  of  Ti{b2)  for  each  i.  (3.1) 

Note  that  the  ranks  of  these  Ti(6)’s  will  change  only  at  the  solutions  of  the  equations  Ti{b)  =  Tj{b), 
i  ^  j.  The  latter  equations  yield  equations  of  forms 

Li  -  bXi  =  Lj  -  bXj,  Li  -  bXi  =  Rj  --  bXj  or  R^-  bXi  =  Rj  -  bXj,  (3.2) 

where  L^,  Ri,  Lj  and  Rj  are  finite,  and  Xi  ^  Xj,  Since  there  are  at  most  4n^  equations  of  forms  in  (3.2), 
there  are  at  most  4n^  distinct  solutions  to  these  equations,  denoted  by  bi  <  *  ‘  *  <  bm*  Let  60  =  “00  and 


bm+i  =  00.  By  construction,  if  6  G  {bk^bk-\-i),  then  Ti(6)’s  will  not  change  their  ranks.  Consequently,  it 
follows  from  (3.1)  that 

for  each  k,  l{b)  (=  L{Fb,b))  is  constant  on  the  open  interval  (6fc,6fc+i)-  (3.3) 

There  are  m  + 1  disjoint  open  intervals  of  form  (6^,  6fc+i)  and  m  disjoint  closed  intervals  of  form  [6^,  6^]. 
As  a  consequence,  there  are  at  most  2m  + 1  distinct  values  of  l{b),  which  can  be  represented  by  l{bj),  j  =  1, 
2m  + 1,  where  =  6i  -  1,  =  6^  +  1,  =  bi,i  =  1,  m,  and  =  (6<  +  bi+i)/2,  i  =  1, 

m  —  1.  Denote 

=  (3.4) 

In  order  to  find  an  SMLE  of  /3,  it  suflSces  to  compare  l{b)  for  b  £  Ai-  Let  be  the  set  of  all  points  in 
Ai  that  maximize  l{b),  b  G  Ai.  The  C  B,  Moreover,  if  ^  C  B  by  (3.3).  To 

summarize,  we  have  the  following  algorithm: 

Algorithm  3.1  (for  obtaining  all  solutions  of  the  SMLE  of  /3): 

(1)  Derive  the  set  Ai  (see  (3.4)). 

(2)  Compute  Fi,  and  l{b),  where  6  G  .Ai.  For  computing  F5,  we  refer  to  TurnbuU  (18)  or  Wellner  and  Zhan 
(19).  Then 

6  G  B  iff  either  (1)  6  G  B^  or  (2)  b  G  (62^,627-1-2)  for  a  b^j^i  E  B^. 

Each  0,F0),  /3  G  B,  is  an  SMLE  of  {^,Fo)- 

If  the  SMLE  is  not  unique,  one  can  make  the  following  choices:  (1)  choosing  an  SMLE  that  is  closest 
to  the  median  of  B;  (2)  choosing  an  SMLE  that  is  closest  to  the  median  of  B^;  (3)  choosing  an  SMLE  that 
is  closest  to  the  BJE.  The  second  choice  is  the  easiest  one  to  implement. 

It  is  often  time-consuming  to  compute  the  GMLE  Thus  it  is  desirable  to  reduce  the  distinct  values 
of  6  involved  in  Step  2  of  Algorithm  3.1.  The  following  algorithm  reduces  the  number  of  6  involved  by  a 
factor  of  4. 

Algorithm  3.2. 

1.  Find  the  solution  6  to  the  equation  of  form  Li  —  bXi  =  Lj  —  bXj  or  R^  —  bXi  =  Rj  —  bXj,  where  Li,  Lj, 
Ri  and  Rj  are  finite,  and  Xi  /  Xj.  Let  A2  be  the  set  of  all  the  distinct  elements  of  these  solutions. 

2.  Compute  F^  and  Z(6),  6  G  A2»  Then  identify  those  points  in  A2  which  maximize  l{b),  b  G  A2  and  denote 
B*  the  collection  of  these  points. 

3.  Choose  b  E  B*  that  is  closest  to  the  median  of  B*.  Treat  it  as  the  SMLE  of  p. 

This  algorithm  is  faster  and  our  simulation  results  suggest  that  if  n  is  large  then  it  yields  an  SMLE.  To 
accelerate  the  algorithm,  we  may  further  make  the  following  modifications: 

(i)  Randomly  select  a  subset  J  of  the  set  {{i,j)  :  1  <  i  <  j  <  n}.  Let  A2  be  the  collection  of  the  solutions 
6  to  the  equation  of  form  Li  —  bXi  =  Lj  -  bXj  or  Ri—  bXi  =  Rj  —  bXj,  where  {i,j)  G  J. 

(ii)  Modify  A2  as  .A3  =  .42  n  [a,  6],  where  [0,6]  is  an  interval  to  which  we  suspect  that  13  belongs.  For 
example,  we  may  let  [a,  b]  —  [$  —  3a $  +  3a ^],  where  $  is  the  BJE  and  a^  is  the  standard  error  (SE) 

of^. 
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Note  that  Step  (i)  reduces  the  amount  of  (i^j)  in  calculation,  and  Step  (ii)  further  reduces  the  cost  in 
computing  the  GMLE  of  F  for  selected  b  G  As- 

The  following  example  illustrates  the  difference  between  the  two  algorithms. 

Example  3.1,  Suppose  that  there  are  3  observations  (Li,iii,Xi)’s.  They  are  (1,4,1),  (3,6, —1),  (1,2,1). 
We  first  consider  Algorithm  3.1.  Step  1  results  in  m  =  5, 

{biM^h^hjbs)  =  (-2.5, -2, -1, -0.5, 0.5)  and 


Ai 

= 

-2.5, -2.25,  -2, -1.75, 

-1,  -0.75,  - 

0.5, 0,0.5, 1.5}. 

In  Step 

2, 

we  need  to  compute  Ft 

and  l{b),  b  6  Ai.  For 

simplicity,  we  only  demonstrate 

for  b"  and  63. 

(1) 

bi 

=  —3.5.  Then  the  Ji’s 

are  (4.5, 7.5],  (-0.5,2 

.5],  (4.5, 5.5]. 

■P-3.5(<)  = 

+  |l(<>5)-  ^(^1)  = 

i^2\2 

(2) 

bi 

=  —2.25.  Then  the  I 

i’s  are  (2.25,6.25],  (0. 

75,3.75]  and 

[  (2.25,4.25].  P’_2.25 

;(t)  =  l(t>3.5)  and 

II 

1. 

To 

summarize.  Step  2  gives  the  following  results: 

i:  1  2 

3  4  5 

6  7 

8  9  10 

11 

6?:  /-3.5  -2.5 

-2.25  -2  -1.75 

-1  -0.75 

-0.5  0  0.5 

1.5  \ 

^(^i)  •  \  ^  4f 

1  1  1 

1  1 

0.25  0.25  0.25 

0.25  ) 

Thus  B  =  (—2.5,  —0.5)  and  =  {—2.25,  —2,  —1.75,  —1,  —0.75}.  In  this  example  there  are  infinitely  many 
SMLE’s.  We  choose  the  median  of  B®,  that  is,  /3  —  —1.75. 

If  we  use  Algorithm  3.2,  Step  1  results  in  ^2  =  {—2,  -1}  and  Step  2  results  in  an  estimate  $  =  — 1. 
Both  algorithms  lead  to  an  SMLE  and  the  second  algorithm  is  obviously  faster. 

Remark  3.1.  Variance  estimation  is  important  for  making  inferences.  Since  we  do  not  make  any  specific 
assumption  on  Fo^  there  are  two  possibilities  for  the  variance  of 

1.  Under  interval  censoring  without  exact  observations,  if  the  regularity  conditions  stated  for  the 
Cramer-Rao  lower  boimd  hold  and  Fp  is  differentiable,  then  a  consistent  estimator  of  the  eflScient  lower 
bound  for  the  regression  problem  is 


vE( 


MW) 


{Xi  X)) 


(3.5) 


where  «  0,  e.g.,  h„  =  n 

2.  If  P’0  is  differentiable  but  or  if  Fo  is  not  continuous,  then  the  regularity 

conditions  stated  for  the  Cramer-Rao  lower  bound  do  not  hold.  In  such  situations,  is  not  a  good  estimator. 

In  view  of  the  above  discussion,  we  suggest  to  use  the  bootstrap  method  (see,  e.^.,  Davison  and  Hinkley 
(20))  to  estimate  the  variance  of  /9.  It  seems  to  us  from  simulation  that  the  bootstrap  estimator  is  consistent. 
4.  SIMULATION  RESULTS 

Hereafter,  we  denote  $  the  SMLE.  The  simulation  results  suggest  that  $  is  consistent  and  the  bootstrap 
estimates  of  the  variances  of  $  match  the  sample  variances.  The  simulation  results  also  indicate  that  the 
SMLE  is  a  feasible  procedure  computationally. 

In  our  simulations,  we  make  use  of  a  Case  1  or  Case  2  interval  censorship  model.  Under  the  Case  2 
interval  censorship  model,  the  observable  random  vector  satisfies 


{(-oo,U)  MYKU 
({7,J7  +  V)  ifZ7<y  <  U  +  V 
(C7  +  v;oo)  ify>t7  +  v, 


y  >  0,  and  y  and  (C7,  V)  are  independent.  Under  the  Case  1  interval  censorship  model,  the  observable 
random  vector  satisfies 


{-oo,U) 

{U,oo) 


i£Y<U 

ifY>U, 


and  Y  and  U  are  independent. 
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In  our  simulation,  we  further  assume  that  the  underlying  distribution  of  Y  and  the  censoring  vector 
satisfy  the  following  conditions:  (1)  ?7  is  a  continuous  nonnegative  random  variables;  (2)  F  is  a  nonnegative 
continuous  random  variable;  (3)  X,  e,  U  and  V  are  independent. 

Note  that  in  our  assumptions,  e  and  X  may  be  continuous  or  discontinuous.  We  present  simulation 
results  under  three  different  sets  of  distributions.  Under  each  set  of  assumptions,  we  compute  $  based  on 
sample  sizes  30  and  100,  respectively,  with  500  simulations  in  each  case.  The  program  was  written  in  C 
language  and  the  simulation  was  carried  out  on  a  Pentium  III  PC. 

Case  1.  (Case  2  interval  censoring).  Assume  that  e  has  the  xmiform  distribution  on  the  interval  (0,2); 
U,  V  and  X  have  exponential  distributions;  and  /3  =  1. 

Case  2.  (Case  2  interval  censoring).  Assume  that  e  takes  values  0.8  and  4  w.p.  0.8  and  0.2,  respectively; 
X  is  a  discrete  random  variable  which  takes  values  i  w.p,  i/15,  i  =  1,  2,  3,  4,  5;  ==  1;  and  U  and  V  have 

uniform  distributions  on  the  intervals  (0,4)  and  (0.5,5),  respectively. 

Case  3.  (Case  1  interval  censoring).  Assume  that  e  has  a  uniform  distribution  on  the  union  of  intervals 
(0, 0.5)  U  (50.5, 51);  the  inspection  time  U  has  a  uniform  distribution  in  the  interval  (0,3);  X  takes  values  0.5 
and  1  w.p.  0.5  and  0.5,  respectively;  and  ^=1, 


Table  1.  Simulation  Results  on  estimating  with  interval- censored  d 

ata. 

cases 

n  =  30 

n  =  100 

Case  1. 
continuous  Fq 

$n  average  (SE) 
average  ses 

SE  of  seB 

1.020  (0.710) 
0.489 
(0.264) 

1.060  (0.401) 
0.319 
(0.203) 

■ 

Case  2. 
discrete  Fo 

$ri  average  (SE) 
average  ses 

SE  of  scb 

1.015  (0.279) 
0.286 
(0.067) 

1.007  (0.073) 
0.098 
(0.013) 

■ 

Case  3. 
continuous  Fq 

$„  average  (SE) 
average  ses 

SE  of  sbb 

0.824  (1.355) 
0.430 
(0.336) 

0.986  (0.391) 
0.275 
(0.098) 

1 

In  Table  1,  the  entries  in  the  column  corresponding  to  n  stand  for  the  results  with  sample  size  n.  The 
results  in  Table  1  suggest  that  p  is  consistent  in  all  the  three  cases.  Verify  that  ^  £^(  )  in  all 

the  three  cases,  thus  we  cannot  use  d-|  (see  (3.5))  to  estimate  the  variance  of  ^0.  In  each  simulation,  using 

the  bootstrap  method  described  in  Efron  and  Tibshirani  (21)  or  Davison  and  Hinkley  (20),  we  resampled 
(with  replacement)  B  times.  Each  resample  size  is  n.  The  bootstrap  estimate  of  a denoted  by  se^,  is  the 

sample  standard  error  (SE)  of  the  B  estimates  /?  based  on  the  B  resamples.  As  suggested  by  Efron  and 
Tibshirani,  we  set  B  =  30.  The  entries  corresponding  to  the  row  of  “average  sejs”  and  the  row  of  “SE  of 
scb”  are  the  sample  means  and  the  sample  standard  errors  of  these  ses  iu  the  500  simulations,  respectively. 
The  differences  between  the  sample  SE’s  of  ^  and  the  bootstrap  estimates  ses  are  within  2  standard  errors 
for  the  three  cases,  except  for  Case  3  when  n  =  30.  This  suggests  that  the  bootstrap  estimator  ses  of  the 
is  appropriate. 

We  also  carried  out  simulation  under  the  assumption  that  e  has  a  normal  distribution  or  an  exponential 
distribution.  The  results  are  similar  and  are  not  reported  here. 


5.  APPLICATIONS 

In  this  section,  we  apply  omr  method  to  several  data  sets,  including  complete  data  {L  =  R  w.p.l), 
right-censored  data  and  interval-censored  data.  We  also  compare  the  SMLE  to  the  BJE,  which  is  also  an 
M-estimator  (see  Li  and  Zhang  (12)). 

Example  5.1.  (Magazine  advertising  (Chatterjee  and  Price  ((22),  p.  257)).  In  a  study  of  revenue  from 
advertising,  data  were  collected  for  41  magazines  in  1986.  There  was  no  censoring.  Let  X  denote  the  number 
of  pages  of  advertising  and  Y  the  advertising  revenue. 
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The  41  data  are  plotted  in  Figure  1.  Roughly  speaking,  there  axe  three  outliers  in  the  data  set.  They 
are  (25,  50),  (15,  49.7),  (77,  6.6).  The  SMLE  of  ^  is  imique  for  this  data  set.  The  SMLE  and  the  LSE  are 
significantly  different  (see  the  first  block  of  Table  2).  The  entries  in  the  second  block  of  Table  2  are  results 
after  deleting  the  three  outliers.  From  Table  2,  it  is  seen  that  the  SMLE  of  does  not  change  after  deleting 
outliers,  though  the  estimate  of  a  changes. 

In  Figm:e  1,  we  also  plot  the  fitted  straight  lines  with  and  without  deleting  those  three  outliers.  We 
further  plot  the  fitted  line  by  the  SMLE  method  without  deleting  the  outliers.  From  Figure  1,  it  is  seen  that 
the  fitted  line  by  the  SMLE  approach  using  the  original  data  is  very  close  to  the  least  squares  fitted  line 
after  deleting  outUers.  This  suggests  that  the  SMLE  is  robust  while  the  LSE  is  not. 


Table  2.  Results  on  estimating  (a,  P) 

SMLE  (SE) 

LSE  (SE) 

with  outliers 

1.200  (0.196) 

0.353  (0.1449) 

a 

-1.427  (3.178) 

7.604  (2.3466) 

without  outliers 

1.200  (0.1379) 

1.238  (0.138) 

a 

-0.642  (1.410) 

-0.962  (1.409) 

Example  5. 2*  (Application  to  the  Stanford  heart  transplant  data).  The  data  and  detailed  description  can 
be  found  in  Miller  ((5),  p.  156).  In  this  data,  right-censored  survival  time,  indicator  of  death,  indicator  of 
rejection,  2  covariates  including  T5  mismatch  and  age  of  the  recipient  at  time  of  transplant  were  recorded  for 
69  patients.  For  illustrative  purposes,  Buckley  and  James  (4)  compared  their  method  to  Miller’s  estimator 
and  Cox  method  by  fitting  a  simple  linear  regression  for  all  death  against  age.  We  compare  our  method  to 
the  B JE  using  the  same  data  under  the  log  linear  regression  model. 

Our  algorithm  results  in  an  SMLE  ^  =  —9.77  with  a  standard  error  (SE)  1.56.  The  SMLE  is  significantly 
negative  and  is  consistent  with  om  a  priori  guess,  namely,  yoxmger  patients  fare  better  in  the  surgery. 

There  are  3  BJE’s  of  13:  —0.028,  0.014  and  0.016  with  the  SE  0.015  (only  the  first  estimate  and  its  SE 
were  reported  in  Buckley  and  James  (4)).  They  are  all  not  significant  at  level  0.05.  This  means  that  there 
is  no  effect  due  to  age. 

In  this  example,  the  SMLE  make  more  practical  sense  than  the  B  JE,  and  thus  is  more  reliable  than  the 
BJE. 
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Example  5. 3*  (Application  to  a  cancer  research  data).  Our  data  analysis  is  applied  to  a  standard 
breast  cancer  relapse  follow-up  study  based  on  data  from  374  women  with  stages  I  -  III  unilateral  invasive 
breast  cancer  surgically  treated  at  Memorial  Sloan-Kettering  Cancer  Center  between  1985  and  1990.  The 
median  follow-up  duration  was  46  months.  Relapse  time  was  given  by  the  time  interval  between  surgery 
and  the  initial  relapse.  A  relapse  that  took  place  between  two  successive  follow-up  visits  was  regarded  as 
interval  censored.  If  a  patient  did  not  relapse  toward  the  end  of  the  study,  then  her  relapse  time  was  right 
censored.  Of  the  374  observations,  300  were  right  censored  (no  relapse),  21  were  left  censored  and  53  were 
strictly  interval  censored.  Bone  marrow  micrometastasis  (BMM)  was  determined  for  each  woman  at  the  time 
of  surgery.  An  important  question  is  whether  remission  duration  is  related  to  the  extent  of  initial  tumor 
burden  defined  as  number  of  BMM  cells  detected.  We  compute  the  B  JE  and  imder  the  log  linear  regression 
model. 

One  expects  that  the  larger  the  BMM  was,  the  shorter  the  patient  survived.  Thus  ^  <  0.  The  BJE  of 
/?  is  —0.012  and  our  estimate  is  =  —0.059  with  a  (bootstrap)  standard  error  0.02.  Note  that  the  BJE  does 
not  fall  in  the  interval  0  —  2SE0  -f  25E),  so  they  are  significantly  different.  Moreover,  the  SMLE  leans 
more  toward  the  correct  direction  than  the  BJE. 

Example  5.4.  (Application  to  a  breast  cosmesis  data).  We  applied  our  procedure  to  the  interval-censored 
data  set  published  in  Finkelstein  and  Wolfe  (23).  The  data  is  a  result  of  a  retrospective  study  to  compare  early 
breast  cancer  patients  who  have  been  treated  with  primary  radiation  therapy  and  a  adjuvant  chemotherapy 
to  those  treated  with  radiotherapy  alone  with  respect  to  the  cosmetic  effects  of  their  treatment.  There  is 
only  one  covariate,  the  group  status,  in  this  data  set.  There  are  94  patients. 

The  BJE  of  ^  is  —0.29  and  our  estimate  ^  is  —0.67  with  a  (bootstrap)  standard  error  0.336.  The  BJE 
falls  in  the  interval  0  —  25J5,  ^  +  2SE),  Thus  they  are  not  significantly  different. 

6.  CONCLUSION 

In  this  paper,  we  propose  algorithms  for  the  SMLE  of  The  procedure  is  actually  applicable  to 
complete  data  or  right-censored  data,  as  demonstrated  in  Examples  5.1  and  5.2. 

We  believe  that  each  SMLE  of  is  consistent  under  interval  censoring.  The  outline  of  the  proof  is  as 
follows. 

(1)  By  the  definition  of  the  SMLE  and  the  strong  law  of  large  number, 

lim  (lnL(FA,4))/n  >  E(lnL(jPo, y3))/n  a.s.. 

n— foo  ^ 

(2)  By  Fatou’s  Lemma,  J[^(lnL(JP^,/3))/n  <  f7(liiL(F»,fe,))/n  a.s.,  where  (F,,fe*)  is  the  limit  of  a  conver¬ 
gent  subsequence  of  the  SMLE 

(3)  jB(lnL(F, 6))/n  <  J5(lnL(Fo,)8))/n  for  each  (F,6),  and  the  equality  holds  only  if  6  =  by  the  Shannon- 

Kolmogorov  inequality  and  the  following  assumption: 

A3.  FeTajidF{F{W-bX)  =  FoiW-0X)foTW  =  LoTR}  =  l  imply6  =  ^. 

(The  assumptions  made  in  Section  4  satisfy  A3.) 

Statements  (1)  and  (2)  imply  that  E(lnL(F*,6*)yn  =  jE?(lnL(Fo,/3))/n,  thus  statement  (3)  implies  that 
6*  =  /3.  Since  K  is  an  arbitrary  limiting  point  of  is  consistent. 

In  general,  the  BJE  is  not  efficient  and  is  not  robust,  as  the  BJE  reduces  to  the  LSE  in  the  case  of 
complete  data  and  the  LSE  is  not  efficient  and  is  not  robust  (see  Draper  and  Smith  ((24),  p.  342)).  We 
^PPfy  both  the  BJE  procedure  and  the  SMLE  to  complete  data,  right-censored  data  and  interval-censored 
data.  The  SMLE  seems  quite  robust  and  always  give  reasonable  estimates.  The  BJE  may  give  unreasonable 
estimates. 

For  the  semi-parametric  set-up,  there  are  several  estimation  procedures,  e.g.,  the  BJE  and  the  M- 
estimators.  They  are  all  obtained  by  iterative  algorithms.  The  cmrent  procedure  is  also  obtained  by  an 
iterative  algorithm,  unless  the  data  are  Case  1  interval-censored  data.  In  the  latter  case,  the  SMLE  can  be 
obtained  by  a  non-iterative  algorithm,  as  the  GMLE  has  closed-form  solution  (see  Ayer  et  al  (25)).  The 
reason  is  as  follows.  There  are  two  steps  in  obtaining  all  SMLE’s:  (1)  finding  all  the  discontinuity  points  6i, 
...,  bm  <  4n^,  see  (3.4)),  and  (2)  computing  the  GMLE  F^  and  comparing  Z(6),  b  =  6i’s  (see  algorithm 
3.1).  Since  the  GMLE  with  Case  1  interval-censored  data  has  a  closed-form  expression,  all  the  SMLE’s  of 
0  can  be  obtained  in  finitely  many  steps.  However,  with  Case  1  interval-censored  data,  the  BJE  and  the 
M-estimates  cannot  be  obtained  by  a  non-iterative  algorithm. 
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We  only  discuss  the  simple  linear  regression  model  in  this  paper.  The  method  can  be  extended  to  the 
multiple  linear  regression  model.  However,  the  computation  can  only  performed  on  high-speed  computers. 
Also,  only  extension  of  Algorithm  3.2  together  with  Steps  (i)  and  (ii)  is  feasible  due  to  the  heavy  computation 
cost. 
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Bone  marrow  micrometastasis  is  a  significant  predictor  of  long-term  relapse-free  survival  for 
breast  cancer  by  a  non-proportional  hazards  model. 

George  YC  Wong  Ph  D  1,  Qiqing  Yu  Ph.D  1  and  Michael  P  Osborne,  MD  1.1  Preventive  Oncology 
Research,  Strang  Cancer  Prevention  Center,  New  York,  NY,  United  States,  10021  . 

Background:Long-term  predictive  significance  of  the  presence  of  bone  marrow  micrometastases 
(BMM)  on  breast  cancer  relapse  is  a  substantively  important  question  for  clinicians.  Two  published 
long-term  studies,  the  Royal  Marsden  study  (Lancet  1999;  354:197-200)  and  our  recent  ASCO 
abstract  (Proc  Am  Soc  Clin  Oncol  2002,21:228),  have  both  concluded  that  BMM  was  not  a 
significant  predictor  of  relapse-free  survival  (RFS)  by  Cox  proportional  hazards  (PH)  regression 
analysis.  However  the  RFS  curves  comparing  presence  and  absence  of  BMM  were  separated  in  both 
these  data  sets.  Diagnostic  plots  for  PH  assumption  indicated  that  om*  RFS  data  were  not  consistent 
with  such  an  assumption  Consequently,  Cox  regression  was  inappropriate  for  the  data  and  the 
logrank  test  was  invalid.  We  analyzed  our  BMM  data  using  a  semi-parametric  non-PH  regression 
model. 

Material  and  Methods:  BMM  was  determined  using  monoclonal  antibodies  to  cytokeratin  at 

the  time  of  initial  surgery  in  375  women  with  unilateral  T1-2NO  (56%),  T1-2N1  (43%)  and 
T3-4(l%)  breast  cancer.  Relapse  time  was  interval  censored  between  two  successive  follow-up 
times.  RFS  data  were  analyzed  using  a  regression  model  with  a  nonparametric  error  distribution  that 
does  not  involve  PH  assumption.  Statistical  inference  was  based  on  a  semi-parametric  maximum 
likelihood  estimation  procedure. 

Results:  Median  follow-up  was  8  years  (range  1  month-15  years).  BMM  was  detected  in  124  (35%) 
patients  Contingency  table  analysis  showed  that  BMM  did  not  correlate  with  the  standard  prognostic 
variables  of  lymph  node  status  (LN),  tumor  diameter  (TD),  estrogen  and  progesterone  receptor 
levels.  In  the  rmivariate  non-PH  RFS  analysis,  BMM  was  significant  at  p=0.05.  In  the  multivariate 
analysis,  BMM  was  still  significant  at  p=0.05  in  the  presence  of  LN  and  TD.  hi  contrast,  only  LN 
and  TD  were  needed  in  the  multivariate  analysis  by  Cox  regression. 

Discussion:  Recent  published  long-term  studies  using  Cox  regression  have  led  to  result  that  BMM 
is  not  a  significant  predictor  for  RFS.  Using  a  novel  non-PH  regression  model,  we  were  able  to 
demonstrate  BMM  as  a  statistically  independent  significant  predictor  of  RFS  in  a  multivariate 
regression  model  incorporating  both  LN  and  TD  as  covariates. 


